CGS2557: Internet Programming;  Perl Guide...


OVERVIEW

Perl (Practical Extraction and Report Language) is a general-purpose, 
high-level programming language created by Larry Wall.  Perl is derived 
from, and has many features of, the C language, awk, sed, the Unix shell, 
and other computer languages.  Perl has evolved to include variable 
references, complex data structures, packages, modules, object-oriented 
programming, and threads.  Perl's powerful text processing, text 
indexing, and pattern matching make it a popular choice for Web 
CGI development.  This chart covers the most basic Perl commands that 
are common to Unix (Linux), Windows, and Macintosh.


PLATFORM    SOURCE

Unix/Linux  http://www.perl.com/CPAN-local/ports/index.html 
Windows     http://www.activestate.com/
Macintosh   http://www.macperl.com/



COMMAND LINE

perl [-sTuU] [-hv] [-V[:configvar]] [-cw] [-d[:debugger]]
[-D[number/list]] [-pna] [-Fpattern] [-l[octal]]
[-0[octal]] [-Idir] [-m[-]module] [-M[-]'module...'] [-P]
[-S] [-x[dir]] [-i[extension]] [-e 'command'] [--] 
[programfile] [argument]...

* Run a Perl program on the command line:
  perl -w scan.pl
  
  In the above example, -w enables warnings about questionable 
  constructs (you should always use -w)

* Write Perl commands on the command line:
  perl -e "print 'Hello, Larry!'"



COMMAND LINE SWITCHES

SWITCH          MEANING

--              Ends switches

-0[number]      Specify record separator ($/) as octal number

-a              Turns on autosplit mode when used with -n or -p

-c              Check syntax of script, exit without running (will 
                execute BEGIN blocks)

-d              Invoke Perl debugger

-d:module       Run script under control of debugging/tracing module

-Dopts          Set debugging flags.  Flags can be set by summing 
                numbers or by letters


NUMBER   LETTER MEANING

1        p      Tokenizing and parsing
2        s      Stack snapshots
4        l      Label stack processing
8        t      Trace execution
16       o      Object method lookup
32       c      String/numeric conversions
64       P      Print preprocessor command for -P
128      m      Memory allocation
256      f      Format processing
512      r      Regular expression processing
1024     x      Syntax tree dump
2048     u      Tainting checks
4096     L      Memory leaks (no longer supported)
8192     H      Hash dump
16384    X      Scratchpad allocation
32768    D      Cleaning up

-e 'command'    Run the Perl code given on the command line

-Fpattern       Specify pattern to split on if -a is specified

-h              Print summary of command line options

-i[extension]   Set files processed by <> construct to be edited in 
                place.  If extension is supplied, it will be used 
                in the name of the backup copy of each file.  
                Otherwise, no backup takes place.

-Idirectory     Prepend directory to @INC

-l[octalnumber] Enable automatic line-end processing

-m[-]module     Invoke "use module ();" before executing script
                (with -m-module, "no module;")

-M[-]module     Invoke "use module;" before executing script
                (with -M-module, "no module;")

-p              Assume loop around script to iterate over filename 
                arguments.  Lines are printed.

-P              Run Perl script through C preprocessor before compilation

-s              Enable rudimentary parsing of switches that follow the 
                script name

-S              Search for script using PATH environment variable

-T              Force taint checking

-u              Dumps core after compiling

-U              Allow unsafe operations

-v              Prints Perl version and patch level

-V              Prints summary of configuration

-V:name         Print value of named configuration

-w              Enable extra warning messages (you should always use -w)

-x[directory]   Run script embedded in a message, skipping any leading 
                text before the #! line



PERL SYNTAX & STRUCTURE

* Perl syntax is very flexible; the guiding principle is:
  "There's More Than One Way To Do It."  
  Perl programmers refer to this as
  TIMTOWTDI, pronounced "Tim-toady".



COMMENTS

* Comments begin with # and can appear almost anywhere a Perl statement 
  can appear: #-This is a comment.

  $foo = 3.141592654; #set $foo to the value of pi

#!
* When prefaced with #! (pronounced "shebang"), the command line options 
  may be specified on a line within a Perl script, except -m and -M.

  #!/usr/bin/perl -wT
  #!C:\perl\bin\perl -wT



VARIABLES

* Variable names can be from 1 to 251 characters. 

* Variable names must begin with a letter or underscore, followed by upper or 
  lowercase letters, underscores or digits.

* Variable names are case sensitive ($salesprice is not the same
  as $salesPrice).

* Variable names are normally lowercase:
  $sales_price = $price - $price * $discount;

* Variable names are always preceded by one of the following three symbols 
  $, @, or %
  $ - Scalar Variables - Hold a single value
  @ - Arrays
  % - Hashes



NUMBERS

* Specify numbers in integer or floating point format (positive or negative):
  12              #integer
  3.141592654     #floating point number
  -2.34e-5        #negative number in scientific notation (-2.34 x 10^-5)


* Use underscores to enhance readability:

  12_456_452.45   #equal to 12456452.45


* Use prefix 0x to specify hexadecimal numbers
* Use prefix 0 to specify octal numbers 

  0x4e            #hexadecimal
  0577            #octal



STRINGS

* Delimit strings with single quotes, double quotes, or backquotes 
  (grave accents).  Double-quoted and backquoted strings are interpolated.

  'abcdefg'
  "hello, world\n"
  `/bin/ls`

* Strings can have no characters at all ("" or '', the empty string), or 
  they can grow to fill all memory.  Within double-quoted strings, 
  variables and backslash escapes are interpreted:

  my $company = "edelweiss";    #string
  print "\uhello, \u$company";  #prints Hello, Edelweiss
  my $greeting = "hallo";
  my $location = "there";

  my $salutation = "$greeting $location\n";
  # $salutation now contains "hallo there\n", where \n is the 
  # newline character

  print ($salutation);  # prints out "hallo there" and then a newline.

  $salutation = "${location}, ${greeting}ween\n";
  # $salutation now contains "there, halloween\n"



LOGIC

* Perl uses scalars for boolean expression evaluation:
  Numbers       0 is false.  All other numbers are true.
  Strings       "" and "0" are false. All other strings are true, 
                including "00", "-0", "1", etc.
  References    All references are true.
  Undef         undef is false.
  Backticks     The command's output is tested like any other string.



SCALARS

* Declare variables by listing them.
  my $age;

* Assign values to variables using the equals (=) sign.
  $age = 12.2;

* Combine declaration and assignment as follows:
  my $age        = 12.2;          #number
  my $company    = "edelweiss";   #string
  my $undefined  = undef;         #undefined value
  my $twinage    = \$age;         #reference to $age

* Assign in list form (see list section for more info):
  my ($tony_company, $cindy_company)=("edelweiss", "CTS");



ARRAY VARIABLES

* Arrays begin with @ and contain lists.  Elements in a list are 
  in a specific order.  Arrays index scalars, starting at 0 (zero).

* Create and initialize array:
  my @foo = ();                   #empty array
  my @bar = (1,"two",3,4.0,5,6);  #six element array

* Access scalars in an array by using indexes:
  my @baz = ();

  $baz[0] = 3.14159;
  $baz[1] = 'pi';
  $baz[2] = undef;

  $foo[0]   first scalar in @foo
  $foo[1]   second scalar in @foo
  $foo[2]   third scalar in @foo

  $foo[$#foo] last scalar in @foo

* Use negative indexes to get elements at the end of an array:
  $foo[-1]  last scalar in @foo
  $foo[-2]  2nd to last scalar in @foo
  $foo[-3]  3rd to last scalar in @foo
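
* The indexing rules above can be tried out with a short script.  This 
  is only a sketch; the array name and its contents are made up for 
  illustration.

```perl
use strict;
use warnings;

# Hypothetical data, chosen only for illustration.
my @planets = qw(Mercury Venus Earth Mars);

my $first     = $planets[0];          # first element
my $last      = $planets[-1];         # last element, via a negative index
my $also_last = $planets[$#planets];  # same element, via the last index
my $count     = scalar(@planets);     # number of elements

print "first=$first last=$last count=$count\n";
```

  Note that $#planets is always one less than scalar(@planets), because 
  indexes start at 0.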



QUOTING SYNTAX

* The quoting syntax gives an easy way to syntactically create strings and 
  lists.  The qw() operator is the most common.  It creates a list by 
  splitting on word boundaries (whitespace). 

  my @stooges = ("Moe","Larry","Curly","Iggy"); #creates a list of Stooges
  my @stooges = qw(Moe Larry Curly Iggy);  #Equivalent list of Stooges

* Note that the quoted strings are not separated by commas.


CUSTOMARY       GENERIC          MEANING

''                q//            quote string with no interpolation
""                qq//           quote string with interpolation

``                qx//           quote string with interpolation, pass 
                                 to operating system to process as a 
                                 command, return results as a string              

()                qw//           word list

//                m//            pattern match

s///              s///           substitution

y///              tr///          translation


"HERE" DOCUMENTS

* Here documents are used to embed lines of text in a Perl 
  script, for printing or variable assignment.

  $foo = <<EOT;
  testing 
  one
  two
  three
  $four
  EOT

* Variable interpolation is "on" by default.  You turn it off 
  by single-quoting the terminator (<<'EOP').

  print <<"EOP";
  interpolation supported:
  $foo is related to $bar
  EOP

* Commands will be executed if the terminator is enclosed in backticks.
  $command_results = <<`END_OF_COMMANDS`;
  ls -al
  ps -aux
  END_OF_COMMANDS



LIST VARIABLES

* A list is a sequence of scalars, enclosed in parentheses and 
  separated by commas.

  ()
  (12)
  (1,2,3)
  ($foo,$bar,$baz)
  (1,$foo,@bar,"four")   #@bar is expanded

* Lists can be assigned:
  ($foo,$bar,$baz) = (1,2,3);

  is the same as saying:  $foo = 1; $bar = 2; $baz = 3;

* Special syntax for lists of strings: qw() splits its contents on 
  whitespace and returns the pieces as a list of strings:

  qw( foo bar baz )
  qw/ foo bar baz/

  are both the same as:  ('foo','bar','baz')

* Lists can be subscripted and sliced.  Slices return lists that are 
  portions of other lists.

  my @list = qw (a b c d);
  @list[0..2]   #returns ("a","b","c")
  @list[3,0]    #returns ("d","a")
  @list[2]      #returns ("c")(a one-element list)
  $list[2]      #returns "c" (a single scalar)

* Subscripts and slices can also be used as lvalues, 
  to modify the elements in an array or hash.  
  my @colors = qw (red white blue);
  $colors[2] = "pink";  #@colors is now ("red","white","pink")
  $colors[3] = "green"; #@colors is now ("red","white","pink","green")
  @colors[0,2] = ("black","brown");
  # @colors is now ("black","white","brown","green")

* Use the range operator for convenient creation of lists:
  (1..5)  #same as (1,2,3,4,5)
  ('A'..'F')  #same as ('A','B','C','D','E','F')

* The join() operator combines elements of a list into a 
  single string.
  my @students = qw (Blair Jo Tootie Natalie);
  my $students = join(", ", @students);  
  # $students is "Blair, Jo, Tootie, Natalie"

* Conversely, split() breaks up a string into a list based on 
  a string or regular expression.
  my $band = "John, Paul, George, Ringo";
  my @members = split(", ", $band);  
  # @members is now ("John","Paul","George","Ringo");
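
* As a rough sketch of how join() and split() undo each other, the 
  following round trip joins a list and splits it back.  The pattern 
  /,\s*/ is an illustrative choice that accepts any amount of whitespace 
  after each comma.

```perl
use strict;
use warnings;

my @students = qw(Blair Jo Tootie Natalie);

# join() glues the elements together with the given separator.
my $line = join(", ", @students);

# split() with a pattern tolerates any amount of space after each comma.
my @again = split(/,\s*/, $line);

print "$line\n";
print scalar(@again), " names recovered\n";
```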


LIST INTERPOLATION

* Lists and arrays nested inside a list are flattened (interpolated) 
  into a single list:
  (1,2,3,(4,5,6))  #expands to (1,2,3,4,5,6)
  @foo = (1,2,3);
  @bar = (4,5,6);
  (@foo,@bar)  #expands to (1,2,3,4,5,6)



BASIC ARRAY OPERATIONS

Count scalars in an array              $count = scalar(@foo);
Get index of last scalar in array      $index = $#foo;
Copy an array                          @bar = @foo;
Sort an array                          @bar = sort( @foo );
Reverse an array                       @foo = reverse( @foo );




ARRAY STACK OPERATIONS

Push scalar onto end of array          push( @array, $value );
Push list onto end of array            push( @array, @list );
Pop scalar off the end of array        $value = pop( @array );
Push scalar onto front of array        unshift( @array, $value );
Push list onto front of array          unshift( @array, @list );
Pop scalar off of front of array       $value = shift( @array );


SCALAR vs. LIST CONTEXT

* All Perl operations are in either scalar or list context.  Many 
  operators and functions behave differently depending on their context.

* For example, localtime() will return a nicely formatted date/time 
  string in scalar context, and a list of date/time elements in a 
  list context.
  
  my $time = localtime(time);       #scalar context
  $time = scalar(localtime(time));  #same 

  my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)
      = localtime(time);  #list context
  
  A function can tell whether it's being called in list or scalar 
  context by using the wantarray() function.
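
* A minimal sketch of a context-sensitive subroutine follows.  The 
  subroutine name words() and its behavior are hypothetical, chosen only 
  to show wantarray() at work.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the words in list context,
# or the number of words in scalar context.
sub words {
    my ($text) = @_;
    my @words = split(' ', $text);
    return wantarray ? @words : scalar(@words);
}

my @list  = words("There is more than one way");  # list context
my $count = words("There is more than one way");  # scalar context

print "list: @list\n";
print "count: $count\n";
```

  The same call yields different results only because of what is on the 
  left-hand side of the assignment.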



HASH VARIABLES

* Hashes (associative arrays) begin with a % and index scalars by strings.

Create a hash                        my %foo = ();

Initialize hash by elements          my %foo = ();
                                     $foo{'blue'} = "water";
                                     $foo{'green'} = "grass";
                                     $foo{'yellow'} = "dandelion";

Copy a hash                          %bar = %foo;

Get list of hash keys                @keys = keys( %foo );

Get list of hash values              @values = values( %foo );

Flatten hash into an array           @foo = %bar;

Merge two hashes                     %hash = ( %this, %that );

Iterate over hash key-value pairs    while ( ($key,$val) = each(%ENV)) {
                                          print("$key = $val\n");
                                          }

Delete a hash key-value pair         delete($foo{'blue'});

Check if a key exists                exists($foo{$key})

Check if the value of a key-value     
pair is defined (i.e. exists,        defined($foo{$key})
and not equal to undef)

Get count of hash key-value pairs    $count = scalar(keys(%foo));

Get hash bucket allocation info      $info = scalar(%foo);

Preallocate hash buckets             keys(%foo) = 1000;
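
* Several of the operations above can be combined in one short script.  
  This is a sketch under assumed data: the hash %color_of and its 
  contents are invented for illustration.

```perl
use strict;
use warnings;

my %color_of = (
    banana => "yellow",
    apple  => "red",
    lime   => "green",
);

# keys() returns the keys in no particular order, so sort them first.
my @report;
foreach my $fruit (sort keys %color_of) {
    push(@report, "$fruit is $color_of{$fruit}");
}

print "$_\n" foreach @report;

# delete() removes the pair; exists() then reports the key as gone.
delete($color_of{lime});
my $has_lime = exists($color_of{lime}) ? 1 : 0;
print "lime still present? $has_lime\n";
```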



HASH SLICES

* Hash slices work like array slices

  my %color;
  $color{ "banana"} = "yellow";
  $color{ "apple" } = "red";
  $color{ "lime" } = "green";
  my @sliced_fruit_colors = @color{ "lime","banana" };
  # @sliced_fruit_colors is now ("green","yellow")



STATEMENTS

* A statement is an expression to be evaluated.
* Each statement must end in a semi-colon (;) unless it is the 
  final statement in the block.
* Blocks are groups of statements with a specific purpose and are 
  identified by curly braces({}).
* Remember the rule of TIMTOWTDI: 
  There's More Than One Way To Do It.



CONDITIONAL STATEMENTS

* The if-then-else works as in other programming languages:
  if(expression) {
    # block
  } else {
    # block
  }

* The if-then can take many forms.  The following are all equivalent:

  if($a==$b) {print "Match";}
  print "Match" if ($a==$b);
  print "Match" unless ($a!=$b);



WHILE LOOPS

* The while() loop executes a loop until the control expression returns false.

  while (cat_is_away()) {
       mice_will_play();
  }



FOR LOOPS: OVER A LIST

* The most common form of for() loop is to iterate over a list.  This
  type is often written as foreach(), although it is the same as for().
  Each iteration through the loop assigns the value to $_.

  my @stooges = qw(Moe Larry Curly);
  foreach (@stooges) {
    # do something...
  }

* You can also assign an explicit loop variable

  my @sons = qw(Chip Robbie Mike);
  foreach my $son (@sons) {
    # do something...
  }



FOR LOOPS: C-STYLE

* C-style for() loops have three separate sections, separated by semicolons.

  for(initialization;terminal condition;iteration) {
     # do something...
  }


* The standard use is to count something:

  for(my $i=1;$i<=10;$i++) {
     print "$i ";
  }
     # prints "1 2 3 4 5 6 7 8 9 10 "



* However, iterating over a list is easier to write and, unless 
  you're dealing with very large lists, just as quick to execute:

  for my $i (1..10) {
     print "$i ";
  }
     # prints "1 2 3 4 5 6 7 8 9 10 "



SUBROUTINES

* Declare a subroutine with sub keyword:

  sub factorial {
    my ($n) = @_;                  # first argument
    my $factorial = 1;
    my $i = 0;
    if ($n>0) {
        for($i=1;$i<=$n;$i++) {
            $factorial *= $i;      # multiply in each value 1..$n
        }
    }
    return($factorial);
  }



CALL A SUBROUTINE

  $result = &factorial(10);

* If a subroutine name doesn't conflict with a Perl reserved 
  keyword, the & is not necessary:

  $result = factorial(10);

* Arguments to a subroutine are passed in the @_ special variable.

* If there is no return in a subroutine, then the return value 
  is the value of the last expression evaluated in the 
  subroutine.  However, it's always a good idea to explicitly 
  use return.

* Functions can return lists:

  sub foo 
     {
     ...
     return($d,$e,$f);
     }
  ($x,$y,$z) = &foo();
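
* A small, self-contained sketch of a subroutine that takes its 
  arguments from @_ and returns a list.  The name min_max() is 
  hypothetical and not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the smallest and largest of its arguments.
sub min_max {
    my @numbers = @_;
    my ($min, $max) = ($numbers[0], $numbers[0]);
    foreach my $n (@numbers) {
        $min = $n if $n < $min;
        $max = $n if $n > $max;
    }
    return ($min, $max);
}

# The returned list is assigned to two scalars at once.
my ($low, $high) = min_max(7, 3, 9, 4);
print "low=$low high=$high\n";
```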


SUBROUTINE PROTOTYPES

* Subroutine prototypes are used to check that subroutines 
  are called with the right number and type of arguments.  
  Prototypes are not checked when a subroutine is called with 
  a leading &.

  $     scalar
  @     list
  %     hash
  &     subroutine reference
  *     typeglob


  sub foo ($);     # foo($arg1);
  sub foo ($$);    # foo($arg1,$arg2);
  sub foo ($$;$);  # foo($arg1,$arg2);
                   # or foo($arg1,$arg2,$arg3);
                   # (the argument after ; is optional)
  sub foo (@);     # foo(@bar);
  
  sub foo ($$) {
      my ($foo,$bar) = @_;
      }

* A subroutine defined with an empty prototype that returns a 
  constant value can be inlined by Perl:

  sub e() {exp(1);}  #Euler's number


REFERENCES

Symbolic References

* If a Perl scalar variable $foo contains a string that is the name 
  of another variable, then $$foo is a symbolic reference and 
  evaluates to the value of the variable named by the string.  
  Symbolic references see only package (global) variables, not 
  variables declared with my.

  our $gardenia = "yellow";
  my $flower = "gardenia";
  my $color = $$flower;  #$color now contains the value "yellow"


  my $foo = "bar";
  $$foo = 1001;  #package variable $bar is created and now contains 1001


HARD REFERENCES

Declare a scalar variable               my $color = "blue";

Create reference to $color              my $rcolor = \$color;

Dereference the reference               print $$rcolor."\n";  #prints "blue"

Create reference to a scalar variable   $ref = \$foo;

...to an array variable                 $ref = \@bar;

...to a hash                            $ref = \%baz;

...to a typeglob                        $ref = \*gack;

...to a subroutine                      sub foo {...}
                                        $ref = \&foo;

...to a number                          $euler = \2.718281828;

...to a string                          $BLUE = \"blue";

...to an anonymous array                $ref = [1,2,3,];

...to an anonymous hash                 $ref = {"foo"=>"bar","baz"=>"egk"};

...to an anonymous subroutine           $ref = sub {...};


* A hard reference is a Perl scalar that points to other 
Perl data, subroutine or typeglob.  Hard references are 
preferred over symbolic references:

  use strict 'refs';  # turn off symbolic references
  no strict 'refs';   # allow symbolic references again within a block


* Use the \ (backslash) operator to create a hard reference.

* Since any scalar can have a reference taken, and 
references are scalars, you can take a reference 
to a reference.

  $foo   = "bar";    # scalar variable
  $rfoo  = \$foo;    # reference to a scalar
  $rrfoo = \$rfoo;   # reference to a reference to a scalar


DEREFERENCING

  $$ref;            # dereference a scalar reference
  &$sub($x,$y);     # dereference a subroutine reference
  $$$ref;           # dereference a reference to a reference

* Use the arrow operator as an alternate way to dereference a hash or 
  array reference:

  my @foo = (1,2,3,4,5);
  my $aref = \@foo;
  print $$aref[3];  # prints 4
  print $aref->[3]; # also prints 4

  my %foo = ("flower"=>"edelweiss","rock"=>"pegmatite");
  my $href = \%foo;
  print $$href{'flower'};  # prints edelweiss
  print $href->{'flower'}; # also prints edelweiss


REFERENCE EXAMPLES

  my $foo = "testing";  # scalar variable
  my $rfoo = \$foo;     # reference to scalar variable
  my $rrfoo = \$rfoo;   # reference to a reference
  print($foo,"\n");     # prints "testing"  
  print($rfoo,"\n");    # prints "SCALAR(0x80cf7a4)"
  print($rrfoo,"\n");   # prints "REF(0x80cf7bc)"
  print($$rrfoo,"\n");  # prints "SCALAR(0x80cf7a4)"
  print($$$rrfoo,"\n"); # prints "testing"
  print($$rfoo,"\n");   # prints "testing"

* Note that the values of the memory locations in the examples above 
  (SCALAR(0x80cf7a4), etc.) will vary at run-time.
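
* References are most often used to build nested data structures.  The 
  following is a sketch only; the variable $bands and its contents are 
  made up for illustration.

```perl
use strict;
use warnings;

# An anonymous hash whose values are references to anonymous arrays.
# The names and data are made up for illustration.
my $bands = {
    beatles => [ "John", "Paul", "George", "Ringo" ],
    stooges => [ "Moe", "Larry", "Curly" ],
};

my $drummer = $bands->{beatles}->[3];          # arrow between subscripts...
my $leader  = $bands->{stooges}[0];            # ...is optional
my $size    = scalar(@{ $bands->{stooges} });  # dereference the whole array
my $kind    = ref($bands);                     # type of the reference

print "$drummer, $leader, $size, $kind\n";
```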


FORMATS

Perl formats make it easy to create simple, formatted reports.

Format definition:

format name = 
FORMATLIST
.

Format specs:

@         field holder
@<<<<     left-justified field
@>>>>     right-justified field
@||||     centered field
@####.##  fixed-precision numeric field
@*        field with many lines
^<<<<     filled field
~         suppress the line if all its fields are empty
~~        repeat the line until all its fields are exhausted
#         comment line (# in the first column)

EXAMPLES

format STDOUT = 
Howdy! My name is @<<<<<<<<<<< and I'm @|| years old.
$name,$age
.

$name = "Tony";
$age = 23;

write;


STANDARD FILE HANDLES

* By convention, all filehandles and dirhandles are capitalized.

HANDLE        PURPOSE

STDIN         Standard Input
STDOUT        Standard Output
STDERR        Standard Error
ARGV          Files named on the script's command line (read by <>)
DATA          Data at end of script (after the __END__ or __DATA__ token)


* The DATA file is basically a fake text file at the end of the program.
  The file DATA is opened automatically.

1: #!/usr/bin/perl -w
2: while (<DATA>) {
3:        chomp;# strip trailing "\n";
4:        push( @friends,$_);
5: }
6: __DATA__
7: Pooky
8: Bunny
9: Chimpy

* In this example, lines 7-9 are effectively a 3-line text file that is read 
  by the while() loop in lines 2-5.  At the end of the program, @friends will 
  be ("Pooky","Bunny","Chimpy").


TOKENS

* The following tokens are expanded.  They are helpful for 
  debugging.  Note that they are not interpolated in strings.

  print "Problem at line ",__LINE__," in ",__FILE__,"\n";
  # prints "Problem at line 1 in foo.pl"
  print "Problem at line __LINE__ in __FILE__\n";
  # prints "Problem at line __LINE__ in __FILE__"

 TOKEN         MEANING

__LINE__       Current line number

__FILE__       Current file name

__END__        End of script

__DATA__       End of script and open DATA filehandle

__DIE__        Key in %SIG for the die handler: $SIG{__DIE__}

__WARN__       Key in %SIG for the warn handler: $SIG{__WARN__}

__PACKAGE__    Current package name



MATCHING & TRANSLATION

* Use regular expressions to find and extract patterns and also to change
  text.  General syntax:

  $foo=~/pattern/modifiers;                # search for pattern
  $foo=~s/pattern/replacement/modifiers;   # search and replace
  $foo=~ tr/range/replacement/modifiers;   # translation

* The tables below are simplified for non-pathological 
  cases, and they are by no means exhaustive.  For 
  discussion of the nitty-gritty of regular expressions, 
  refer to perldoc perlre.

REGULAR EXPRESSION METACHARACTERS

CHARACTER     MEANS                          EXAMPLE

\             Escape: removes the            /\$/ matches the character "$",
              special meaning of the         not the end of the string
              following character


|             Alternation: this or           /a|b/ matches "a" or "b"
              that 


(...)         Grouping: treats ... as        "underdog" =~ /(cat|dog)/ returns 
              a single unit, and assigns     1 (true), and $1 contains "dog"
              the text it matched to $n


[...]         Character class                /[aeiou]/ matches any one of 
                                             the vowels


^             Beginning of the string        /^a/ matches "apple" but 
                                             not "banana"


.             Any single character,          /d.g/ matches "dog" and 
              except newline (unless the     "dig", but not "ding"
              /s modifier is used)
  


REGULAR EXPRESSIONS QUANTITY MODIFIERS

CHARACTER          MEANS

*                  Match 0 or more times

+                  Match 1 or more times

?                  Match 0 or 1 time

{N}                Match exactly N times

{MIN,}             Match at least MIN times

{MIN,MAX}          Match between MIN and MAX times inclusive


* All the above quantity modifiers are "greedy," meaning that they match the 
  largest possible string.

* All may be followed by ? (the "non-greedy" modifier) to specify minimal
  matching and to match the shortest possible string.  Consider the 
  following:

  my $str = "Alakazam!";
  $str =~/(A.+a)/;    # $1 is "Alakaza"
  $str =~/(A.+?a)/;   # $1 is "Ala"
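
* The difference matters most when a delimiter can occur more than once.  
  A short sketch, using a made-up HTML fragment as input:

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ runs to the last ">" in the string.
my ($greedy) = $html =~ /<(.+)>/;

# Non-greedy: .+? stops at the first ">".
my ($lazy) = $html =~ /<(.+?)>/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```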


REGULAR EXPRESSION MODIFIERS

MODIFIER      USED BY        MEANING

/i            m//, s///      Ignore case

/s            m//, s///      Let . match newline

/m            m//, s///      Let ^ and $ match next to \n

/x            m//, s///      Ignore whitespace; allows comments

/o            m//, s///      Compile the pattern once (only useful for
                             expressions with variables in the pattern)

/g            m//            Find all matches in the string (Global).  
                             In list context, returns all the matches.

/g            s///           Replaces all occurrences

/cg           m//            Allow continued searches after failed /g match

/e            s///           Evaluate the replacement string as an expression
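
* A brief sketch combining several of the modifiers above.  The sample 
  text is invented for illustration.

```perl
use strict;
use warnings;

my $text = "Order 3 apples and 12 pears";

# /g in list context returns every match.
my @numbers = $text =~ /(\d+)/g;

# /e evaluates the replacement as Perl code; /g applies it everywhere.
(my $doubled = $text) =~ s/(\d+)/$1 * 2/ge;

# /i ignores case; /x allows whitespace and comments in the pattern.
my $found = $text =~ / APPLES   # the word we are looking for
                     /ix ? 1 : 0;

print "@numbers\n$doubled\n$found\n";
```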



REGULAR EXPRESSION ESCAPE SEQUENCES

SYMBOL     OPPOSITE     MEANS

\0                      null character

\nnn                    Octal character "nnn"

\1-9                    nth captured string

\a                      Alarm character

\A                      Beginning of string

\b         \B           Word boundary 

\cX                     Control character Ctrl-X

\d         \D           Digit

\e                      ESC character

\E                      Ends \L, \U, or \Q translation

\l                      Lowercase next character

\L                      Lowercase following characters until \E

\n                      Newline character

\Q                      Quote (escape) following metacharacters until \E

\r                      Return character

\s         \S           Whitespace character

\t                      Tab character  

\u                      Uppercase next character

\U                      Uppercase following characters until \E

\w         \W           Any word character (a-z, A-Z, 0-9, and _) (or not)

\xhh                    Hex character hh (i.e. \x20 is space character)

\z                      End of string

\Z                      End of string before newline


REGEX SPECIAL VARIABLES

* After a regular expression has been called, certain special variables
  will have their value set.

* The most common will be the numeric variables $1, $2, etc.  Each 
  corresponds to a parenthesized subexpression matched in the last 
  regular expression match.

  $name = "Mr. Larry Wall";  
  $name =~/^(Mr\.|Mrs\.)\s+(\w+)\s+(\w+)$/;
  # $1 = "Mr.", $2 = "Larry", $3 = "Wall"


VARIABLE       ENGLISH                  MEANS

$1-9                                    nth group matched

$+             $LAST_PAREN_MATCH        Last parenthesized submatch

$&             $MATCH                   String matched

$`             $PREMATCH                String preceding $&

$'             $POSTMATCH               String following $&



OPERATORS

MATHEMATICAL OPERATORS

OPERATOR   DOES                EXAMPLE

+          Addition            5+7 = 12

-          Subtraction         5150-2112 = 3038

*          Multiplication      4*5 = 20

/          Division            8/3 = 2.66666...
                               unless "use integer" is on, in which case 8/3 = 2 

**         Exponentiation      2**3 = 8
           (raise to a power)        

%          Modulo (remainder)  7%4 = 3


COMPARISON OPERATORS

* Perl has two sets of comparison operators, depending on whether you're 
  comparing strings or numbers.

* The distinction is important, because "5" gt "20", but 5 < 20.
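
* The distinction shows up clearly when sorting.  As a small sketch 
  with made-up values, the same list is sorted once with cmp (string 
  comparison) and once with <=> (numeric comparison):

```perl
use strict;
use warnings;

my @values = (10, 9, 100, 1);

my @as_strings = sort { $a cmp $b } @values;   # string order
my @as_numbers = sort { $a <=> $b } @values;   # numeric order

print "cmp: @as_strings\n";
print "<=>: @as_numbers\n";
```

  With cmp, "100" sorts before "9" because the strings are compared 
  character by character.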


NUMERIC     STRING     MEANS

>           gt         Greater than

<           lt         Less than

>=          ge         Greater than or equal to

<=          le         Less than or equal to 

==          eq         Equal to 

!=          ne         Not equal to 

<=>         cmp        Comparison (returns -1, 0 or 1)


LOGICAL OPERATORS (return true or false)

&&          logical AND
         
||          logical OR
             
!           unary negation

?:          conditional (if ? then : else)

not         logical NOT (low precedence)

and         logical AND (low precedence)

or          logical OR  (low precedence)

xor         logical exclusive OR (low precedence)

ASSIGNMENT OPERATORS

* Perl allows all of the following shortcut operators.

=      +=     -=     *=     /=

%=     &=     <<=    >>=    .=

|=     ^=     &&=    ||=

* For example, the following statements are equivalent.

  $a = $a + 20;
  $a += 20;

* More usefully, the following are also equivalent:

  $big_array[$big_expression] = $big_array[$big_expression] + $b;
  $big_array[$big_expression] += $b;


BITWISE OPERATORS

<<   bitwise shift left
>>   bitwise shift right
&    bitwise AND
~    bitwise complement
|    bitwise OR
^    bitwise exclusive OR



PATTERN MATCHING OPERATORS

=~   pattern match
!~   not pattern match



STRING OPERATORS

x    repetition
.    concatenation
..   Range Operator, or Enumeration


* The x operator makes it easy to create long strings, or 
  repeat common text.  The following are equivalent:

  my $exclamation = "Marcia! Marcia! Marcia! ";
  my $exclamation = "Marcia! " x 3;



LIST OPERATORS

x   Repetition
,   Comma Operator/List Separator
=>  Same as Comma Operator, usually
..  Range Operator, or Enumeration



REFERENCE OPERATORS

ref  returns the type of a reference
\    reference
->   dereference
@$   dereference array
%$   dereference hash
&$   dereference function
*$   dereference typeglob



<> ANGLE BRACKET OPERATORS

open(FILE,"<foo.txt") or die "Cannot open foo.txt: $!";
while(<FILE>)      # read one line at a time into $_
{print;}
close(FILE);

open(FILE,"<foo.txt") or die "Cannot open foo.txt: $!";
my @foo = <FILE>;  # read in entire file into an array
close(FILE);
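
* The same reads can also be written with a lexical filehandle and the 
  three-argument form of open().  This is offered as an alternative 
  sketch, not as the style used elsewhere in this guide; the scratch 
  file is created with the core File::Temp module so the example can 
  run on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example is self-contained.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print {$tmp_fh} "Pooky\nBunny\nChimpy\n";
close($tmp_fh) or die "Cannot close $filename: $!";

# Three-argument open with a lexical filehandle; always check for failure.
open(my $in, '<', $filename) or die "Cannot open $filename: $!";
my @lines = <$in>;     # list context: read every line at once
close($in);
chomp(@lines);         # strip the trailing newlines

print scalar(@lines), " lines: @lines\n";
```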



FILE OPERATORS

OPERATOR     MEANS

-e           Exists
-z           Has zero size
-s           Size of the file
-f           Is a plain file
-d           Is a directory
-T -B        Is a text/binary file
-r -w -x -o  Is readable/writable/executable/owned by effective UID/GID
-R -W -X -O  Is readable/writable/executable/owned by real UID/GID
-u -g -k     Has setuid/setgid/sticky bit set
-l -p -S     Is a symbolic link / named pipe / socket
-b -c        Is a block/character special file
-t           Is opened to a tty
-M -A -C     Age of file (at startup) in days since modification 
             / last access / inode change


* The file test operators work as they do in the Unix Bourne shell.
  
  # Does the file exist at all?
  if(not -e $filename){
     print "$filename does not exist\n";
  
  # Is the file actually a directory?
  } elsif (-d $filename){
      print "$filename is a directory\n";

  # It must be a file.  Get its size and print it 
  } else {
      my $size = -s $filename;
      print "$filename is a file and is $size bytes long\n";
  }
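
* The same tests can be wrapped in a subroutine.  The sketch below uses 
  a hypothetical helper named describe() and creates its own scratch 
  directory and file with the core File::Temp module, so it does not 
  depend on any existing path.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Hypothetical helper that classifies a path using file test operators.
sub describe {
    my ($path) = @_;
    return "missing"    if not -e $path;
    return "directory"  if -d $path;
    return "empty file" if -z $path;
    return "file of " . (-s $path) . " bytes";
}

my $dir = tempdir(CLEANUP => 1);
my ($fh, $file) = tempfile(DIR => $dir);
print {$fh} "hello";
close($fh) or die "Cannot close $file: $!";

print describe($dir), "\n";
print describe($file), "\n";
print describe("$dir/no-such-file"), "\n";
```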


POD (PLAIN OLD DOCUMENTATION)

* POD is a simple markup language to insert documentation into a Perl script.

* POD directives begin with an equal sign (=).  Any line beginning with = 
  starts a POD section which continues until the =cut line.  

* Use POD to comment out large swaths of code:
 
  =pod
  lots of Perl code here
  =cut

POD DIRECTIVES

DIRECTIVE          EFFECT

=pod               Start of POD
=cut               End of POD
=head1 heading     Level 1 heading
=head2 heading     Level 2 heading
=item n            Item in numbered list
=item *            Item in bulleted list
=item B<NOTE>      Item labeled with the bold text "NOTE"
=over n            Indent over n spaces
=back              Unindent (opposite of over)
=begin x           Bracket begin and end of format x
=end x             Bracket begin and end of format x
=for x             Next paragraph is format x


POD TAGS

* POD tags work sort of like HTML tags, allowing changes to how text is 
  displayed.

TAG            MEANING

B<text>        Bold text
C<code>        Literal source code
E<gt>          Greater than sign
E<lt>          Less than sign
E<html>        Named HTML entity
E<n>           Character escape
F<file>        File name
I<text>        Italics
L<name>        Link or cross reference
L<name/ident>  Item in man page

L<name/"sec">       
L<"sec">       Section in man page
L</"sec">

S<text>        Text contains non-breaking spaces
X<index>       Index entry
Z<>            Zero-width character
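
* As an illustration, the directives and tags above might be combined in a
  script like the one below.  The script name greet.pl, the greet()
  subroutine and the file names.txt are hypothetical.  Perl skips everything
  from =pod to =cut when it runs the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

=pod

=head1 NAME

greet.pl - print a greeting (hypothetical example script)

=head1 SYNOPSIS

  perl greet.pl

=head1 DESCRIPTION

Calls C<greet()> and prints the result.  B<Note:> the name is fixed
in this sketch; a real script might read it from F<names.txt>.

=over 4

=item *

C<greet(I<name>)> returns the string C<Hello, E<lt>nameE<gt>!>

=item *

See L<perlpod> for the full list of directives and tags.

=back

=cut

# Everything between =pod and =cut above is ignored by the interpreter.
sub greet {
    my ($name) = @_;
    return "Hello, $name!";
}

print greet('Larry'), "\n";
```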



POD UTILITIES

* These utilities are included with the Perl distribution and can be invoked
  from the command line.

PROGRAM          DOES

pod2fm           Converts pod to FrameMaker   
pod2html         Converts pod to HTML
pod2latex        Converts pod to LaTeX
pod2man          Converts pod to man page format
pod2text         Converts pod to plain text file
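
* pod2text is a front end for the Pod::Text module, so the same conversion
  can be sketched from inside a program.  This assumes a reasonably modern
  Perl in which Pod::Text is built on Pod::Simple.  The pod_to_text() helper
  and the sample document are made up for the example; the sample is built
  with join() so that no line of the script itself starts with an equal sign.

```perl
use strict;
use warnings;
use Pod::Text;

# Convert a POD string to plain text, roughly what pod2text does for a file.
# pod_to_text() is a hypothetical helper, not part of the Perl distribution.
sub pod_to_text {
    my ($pod) = @_;
    my $parser = Pod::Text->new(width => 60);
    my $text   = '';
    $parser->output_string(\$text);         # collect the output in $text
    $parser->parse_string_document($pod);
    return $text;
}

# Hypothetical sample document.
my $sample = join "\n",
    '=head1 NAME',
    '',
    'hello - print a greeting',
    '',
    '=head1 SYNOPSIS',
    '',
    '  perl hello.pl',
    '';

print pod_to_text($sample);
```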



PERL DEBUGGER

DEBUGGER COMMAND           DOES

p expr                     print expr
l [range]                  list a range of lines
w [line]                   list window of lines around a specified line
-                          list previous window
.                          return to executed line
f file                     switch to file and start listing it
l sub                      list named subroutine
S[!]pattern                list subroutines [not] matching pattern
/pattern/                  search forward for pattern
?pattern?                  search backward for pattern
b [line[condition]]        set breakpoint at line
b sub [condition]          set breakpoint at named sub
d [line]                   delete breakpoint at given line
D                          delete all breakpoints
L                          list breakpoints or action lines
a [line] command           set action for a line
A                          delete all line actions
<command                   set an action to execute before each debug prompt
>command                   set an action to execute after each debug prompt  
V [package[pattern]]       list variables matching pattern in package
X [pattern]                list variables matching pattern in current package
! [[-]number]              re-execute command 
!                          re-execute last command                 
! pattern                  re-execute last command that begins with pattern
!! [command]               run command as subprocess
H [-number]                display last number commands
| command                  run debugger command through the current pager
|| command                 run debugger command through the current pager,
                           select DB::OUT also
t                          toggle trace mode
t expr                     trace through execution of expr
x expr                     evaluate expr in list context, print results 
                           (prints complex data structures)
O [opt[=val]]              sets or queries values of debugger options
= [alias value]            set or list current aliases
R                          restart debugger
q                          quit debugger
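
* A small script to practice the commands on.  The total() subroutine and
  the script name sum.pl are hypothetical.  Start it with "perl -d sum.pl",
  then try the commands noted in the comments.

```perl
use strict;
use warnings;

# Add up a list of numbers.
sub total {
    my @nums = @_;
    my $sum  = 0;
    $sum += $_ for @nums;    # at the prompt:  p $sum     prints the sum so far
    return $sum;             #                 x \@nums   dumps the whole list
}

# Suggested session:  b total   (set a breakpoint at the sub)
#                     L         (list breakpoints)
#                     t         (toggle trace mode)
#                     q         (quit the debugger)
print total(1, 2, 3), "\n";
```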


SPECIAL VARIABLES

* Originally, the special variables could be used only in their terse, 
  punctuation-heavy forms.  The English module lets you use the equivalents 
  in the ENGLISH column just by putting "use English;" at the top of your 
  program.

* Note that this can be a performance hit on regular expression-heavy code;
  writing "use English qw(-no_match_vars);" avoids it.

* This list is not all-inclusive.  Specifically, the variables for format 
  lines and process IDs are excluded.
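
* A minimal sketch of the long names in use.  The error_from() helper is
  hypothetical; $EVAL_ERROR, $PROGRAM_NAME and $OSNAME come from the English
  module and are simply other names for $@, $0 and $^O.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);   # long names without the regex penalty

# Run a code reference and return the error text from eval, if any.
# error_from() is a hypothetical helper written for this example.
sub error_from {
    my ($code) = @_;
    my $ok = eval { $code->(); 1 };
    return $ok ? '' : $EVAL_ERROR;          # same variable as $@
}

print "Running $PROGRAM_NAME on $OSNAME\n";          # $0 and $^O
print 'Caught: ', error_from(sub { die "oops\n" });
```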


VARIABLE    ENGLISH        MEANS

$_          $ARG           The Perl Pronoun: the default input and 
                           pattern search space.

@_          @ARG           Argument list passed to the subroutine

ARGV                       Special handle for iterating over command-line 
                           files
$ARGV                      Contains the name of the file when reading 
                           from ARGV                                    

@ARGV                      Contains command-line arguments for the script

$^T         $BASETIME      Time at which the script started executing

$?          $CHILD_ERROR   Status from the last system(), backtick command
                           or pipe close

$^D         $DEBUGGING     Current value of the internal debugging flags

$)          $EFFECTIVE_    Effective group ID of the process
            GROUP_ID

$>          $EFFECTIVE_    Effective user ID of the process
            USER_ID

%ENV                       All current environment variables

$@          $EVAL_ERROR    Error from the last eval

$^X         $EXECUTABLE_   Name of the Perl binary (useful for error 
            NAME           messages)

@INC                       List of directories where Perl searches for 
                           modules

$.          $NR or         Current record number for the last file read 
            $INPUT_LINE_
            NUMBER

$/          $RS or         Input record separator.  By default, it's "\n".
            $INPUT_RECORD_ This may not be a pattern.  There are certain
            SEPARATOR      special values:
                           * undef: causes the next file read to slurp 
                             the entire file
                           * reference to an integer n: fixed-length 
                             reads of n bytes
                           * "": separator is consecutive blank lines

$"          $LIST_         String to put between elements of a list when 
            SEPARATOR      displayed as a string

$^O         $OSNAME        Name of the platform

$,          $OFS or        String to print between list elements in a print
            $OUTPUT_       statement
            FIELD_
            SEPARATOR

$\          $ORS or        Similar to $, but appended after each record
            $OUTPUT_       (each print statement), not between list elements
            RECORD_
            SEPARATOR
            

$^V         $PERL_         Revision, version and subversion of Perl
            VERSION

$0          $PROGRAM_      Filename of the current script
            NAME

%SIG                       Hash of signal handlers

$^W         $WARNING       State of warnings. Can be set to turn on/off 
                           warnings for a section of code
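
* The separator variables are usually set with "local" so that the change
  does not leak into the rest of the program.  A minimal sketch, assuming
  two hypothetical helpers, read_records() and format_row(), that read from
  and write to in-memory strings instead of real files:

```perl
use strict;
use warnings;

# Split a string into records using a given value of $/.
sub read_records {
    my ($text, $separator) = @_;
    local $/ = $separator;                 # restored when the sub returns
    open my $in, '<', \$text or die "Cannot open string: $!";
    my @records = <$in>;
    close $in;
    return @records;
}

# Format a list with chosen values of $, and $\.
sub format_row {
    my @fields = @_;
    my $out = '';
    open my $fh, '>', \$out or die "Cannot open string: $!";
    {
        local $, = '|';                    # printed between list elements
        local $\ = "\n";                   # printed after each print
        print {$fh} @fields;
    }
    close $fh;
    return $out;
}

my $text = "one\ntwo\n\n\nthree\n";
print scalar(read_records($text, "\n")),  " records in line mode\n";
print scalar(read_records($text, "")),    " records in paragraph mode\n";
print scalar(read_records($text, undef)), " record in slurp mode\n";
print scalar(read_records($text, \4)),    " records of 4 bytes\n";
print format_row(qw(a b c));
{
    local $" = '-';                        # list separator inside "..."
    my @list = (1, 2, 3);
    print "@list\n";
}
```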


PRINT FORMATTING

FIELD     MEANS

%%        Percent sign

%c        Character

%s        String

%d%u      Signed/unsigned integer in decimal

%o%x      Unsigned integer in octal/hex

%e        Floating-point number in scientific notation

%f        Floating-point number in fixed decimal notation

%g        Floating-point number, either %e or %f

%X        Like %x, but with uppercase letters

%E/%G     Like %e/%g, but with uppercase "E"

%b        Unsigned integer in binary

%p        Pointer in hex

%n        Stores the number of characters output so far
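
* These fields are used with printf and sprintf.  A minimal sketch; the
  report_line() helper and its column widths are arbitrary choices made for
  the example:

```perl
use strict;
use warnings;

# Format one report line: left-justified name, right-justified count,
# and a ratio shown as a percentage with two decimal places.
sub report_line {
    my ($name, $count, $ratio) = @_;
    return sprintf '%-8s %5d %6.2f%%', $name, $count, $ratio * 100;
}

print report_line('perl', 42, 0.5), "\n";

# Character, octal, hex (lower and upper case) and binary.
printf "%c %o %x %X %b\n", 65, 8, 255, 255, 5;

# Scientific notation.
printf "%e\n", 1234.5;
```

  A number between the % and the letter sets the minimum field width, and
  a leading minus sign left-justifies the value within that width.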


PACK/UNPACK FORMATS

pack() and unpack() manipulate data records much more quickly than 
repeated calls to substr().

CHARACTER    MEANS

a/A          Null-padded/space-padded string of bytes

b/B          Bit string, in ascending/descending bit order

c/C          Signed/unsigned 8-bit character value

d/f          Double/single-precision float in native format

h/H          Hex string, low/high nybble first

i/I          Signed/unsigned integer, native format

l/L          Signed/unsigned long, always 32 bits

n/v          16-bit short in big/little-endian order

p/P          Pointer to a null-terminated/ fixed-length string

q/Q          Signed/unsigned quad 64-bit integer

u            uuencoded string

U            Unicode character number

x/X          Skip / back up a byte

Z            Null-terminated and null-padded string

@            Null-fill to absolute position
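
* A template is a string of these characters, each optionally followed by a
  count.  A minimal sketch, assuming a hypothetical fixed-width record made
  of an 8-byte name, a 16-bit id and an 8-bit flag:

```perl
use strict;
use warnings;

# Hypothetical record layout: space-padded 8-byte name (A8),
# 16-bit big-endian id (n), unsigned 8-bit flag (C).
my $LAYOUT = 'A8 n C';

sub encode_record {
    my ($name, $id, $flag) = @_;
    return pack $LAYOUT, $name, $id, $flag;
}

sub decode_record {
    my ($bytes) = @_;
    return unpack $LAYOUT, $bytes;
}

my $bytes = encode_record('larry', 258, 1);
print length($bytes), "\n";              # 8 + 2 + 1 bytes
print unpack('H*', $bytes), "\n";        # the same bytes as a hex string

my ($name, $id, $flag) = decode_record($bytes);
print "$name $id $flag\n";
```

  Using the same template for pack() and unpack() keeps the two sides of
  the record format in step.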


ENVIRONMENT VARIABLES

Perl uses many variables from your environment:

VARIABLE      MEANS

HOME          Used if chdir is called without an argument

LOGDIR        Same as HOME if HOME isn't defined

PATH          Path to search for the program if the -S command line 
              switch is used; also used for executing subprocesses

PERL5LIB      List of directories to search for Perl modules

PERLLIB       Used if PERL5LIB is not defined

PERL5OPT      Default command-line switches

PERL5DB       Command used to load the debugger
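
* A script can read the same variables through the %ENV hash.  A minimal
  sketch, assuming a hypothetical helper default_dir() that mirrors the
  HOME / LOGDIR fallback described above:

```perl
use strict;
use warnings;

# Pick the directory that chdir would use when called without an argument:
# HOME first, then LOGDIR.  Returns undef if neither is set.
# default_dir() is a hypothetical helper written for this example.
sub default_dir {
    return $ENV{HOME}   if defined $ENV{HOME};
    return $ENV{LOGDIR} if defined $ENV{LOGDIR};
    return;
}

my $dir = default_dir();
if (defined $dir) {
    print "chdir would go to $dir\n";
} else {
    print "no default directory is set\n";
}
```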


PRAGMAS

* This is not an exhaustive list of pragmas, but the most 
  commonly used and useful.

* Note that most pragmas can be turned off with "no" 
  as in "no integer."


PRAGMA                 DOES

use constant           Defines the symbol to be an unchangeable scalar or
                       list.  Older versions of Perl allow only one symbol
                       per "use constant"; newer versions also accept a
                       hash reference defining several at once.

use diagnostics        Expands error messages.  Helpful for debugging if the 
                       short Perl error messages are too terse.

use integer            Use integer math instead of the default floating-point

use lib                Adds directories to Perl's search path

use lib "./libs"       Adds the directory "./libs" to Perl's search path

no lib "./libs"        Removes "./libs" from Perl's search path

use strict             Changes what Perl considers to be legal.  Helps avoid
                       shooting yourself in the foot from typos. Without 
                       "vars", "refs" or "subs", Perl assumes you want all
                       three.

use strict "vars"      Forces all variables to be declared before use.

use strict "refs"      Can't use symbolic references.

use strict "subs"      Treats all barewords as a syntax error

use subs               Declares subroutine names
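
* A minimal sketch of several of these pragmas together.  The names PI,
  WEEKDAYS, int_div() and float_div() are made up for the example:

```perl
use strict;                       # "vars", "refs" and "subs" all enabled
use warnings;
use constant PI       => 3.14159;
use constant WEEKDAYS => qw(Mon Tue Wed Thu Fri);   # a list constant

# Division under "use integer" drops the fractional part.
# The pragma applies only inside this block.
sub int_div {
    my ($x, $y) = @_;
    use integer;
    return $x / $y;
}

# The same division with Perl's default floating-point math.
sub float_div {
    my ($x, $y) = @_;
    return $x / $y;
}

my @days = (WEEKDAYS);
print int_div(7, 2), ' ', float_div(7, 2), ' ', scalar(@days), "\n";
printf "%.2f\n", PI * 2;
```

  Because pragmas are scoped to the enclosing block, "use integer" inside
  int_div() does not change the arithmetic in float_div().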

Site created and maintained by:
Bill Jones wcjones at fccj dot org
Page format last reviewed: April 13th, 2002
Copyright © 2002 WC -Sx- Jones, All Rights Reserved.
...

...