Perl notes
Introduction
Dating back to 1987, Perl is a high-level, general-purpose, interpreted, dynamic programming language. The Perl language borrows features from C, shell script, AWK, and sed.
Advantages
- Supported on most platforms
- Automatic memory management
- Regular expressions are integrated into the language
- A scripting language that requires no development tools
- Built-in debugger
Disadvantages
- Very weak type checking (only 4 types available)
- Use of "magic" variables that lack meaningful names
- Poor number crunching performance compared with compiled languages, because every value is a dynamically typed scalar that is converted between string and numeric form on demand
- Has a reputation for compact and cryptic code
Variable Types
Perl has only four types: scalar, array, hash, and a reference to any of these three. This very limited range of types results in very poor type checking.
scalar - prefix $
The scalar type holds:
- integers
- floating point numbers
- characters
- strings
- parameters received by subroutines
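$number = 100;
$character = 'c';
$string = "fred";
Because a scalar can hold either a string or a number, Perl provides separate comparison operators that indicate how the contents of the scalar should be interpreted.
if ($number > 0)       { ... }   # numeric: greater than
if ($number == 0)      { ... }   # numeric: equal
if ($number < 0)       { ... }   # numeric: less than
if ($string gt "fred") { ... }   # string: sorts after
if ($string eq "fred") { ... }   # string: equal
if ($string lt "fred") { ... }   # string: sorts before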
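The difference between the two operator families shows up when the same pair of values is compared both ways. The following is a minimal sketch; the helper name compare_both_ways and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a label for each comparison that is true.
sub compare_both_ways {
    my ($left, $right) = @_;
    my @true;
    push @true, 'numeric ==' if $left == $right;
    push @true, 'numeric <'  if $left <  $right;
    push @true, 'string eq'  if $left eq $right;
    push @true, 'string lt'  if $left lt $right;
    return @true;
}

# Equal as numbers, but different as strings.
print join(', ', compare_both_ways('1.0', '1')), "\n";   # numeric ==

# 10 is not less than 9, but the string "10" sorts before "9".
print join(', ', compare_both_ways('10', '9')), "\n";    # string lt
```

Choosing the wrong family is a common source of bugs, since Perl will quietly convert the operands to whatever the operator asks for.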
array - prefix @
The array type performs the same function as in other languages, but the syntax is unusual.
@arrDays = ('Mon','Tue','Wed','Thu','Fri','Sat','Sun');
$day = $arrDays[0];   # not @arrDays[0] as you might expect
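The sigil follows what the expression returns rather than the variable it comes from. The sketch below illustrates this with a few common array operations; the variable names are chosen only for this example.

```perl
use strict;
use warnings;

my @days = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun');

my $first   = $days[0];        # one element is a scalar, so the sigil is $
my $last    = $days[-1];       # negative indexes count from the end
my @weekend = @days[5, 6];     # a slice is a list, so the sigil is @
my $count   = scalar @days;    # an array in scalar context gives its length
my $top     = $#days;          # index of the last element

print "$first $last @weekend $count $top\n";   # Mon Sun Sat Sun 7 6
```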
hash - prefix %
The hash type is an associative array of key/value pairs, like a dictionary in other languages.
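A short sketch of creating, updating and reading a hash follows. The names and ages are invented sample data.

```perl
use strict;
use warnings;

my %age = (fred => 42, wilma => 39);   # hypothetical sample data

$age{barney} = 40;                     # add or overwrite one entry
my $fred  = $age{fred};                # one value is a scalar, so the sigil is $
my $known = exists $age{betty} ? 'yes' : 'no';

# Hash order is not defined, so sort the keys for stable output.
for my $name (sort keys %age) {
    print "$name is $age{$name}\n";
}
```

Note that a single value is read with $ and braces, in the same way that a single array element is read with $ and square brackets.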
reference
The reference type is akin to a pointer in C and needs to be dereferenced when used.
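As an illustration, the sketch below takes references to each of the other three types and dereferences them. The variable names are assumptions made for this example.

```perl
use strict;
use warnings;

my @days   = ('Mon', 'Tue', 'Wed');
my %age    = (fred => 42);
my $number = 100;

my $array_ref  = \@days;      # a backslash takes a reference
my $hash_ref   = \%age;
my $scalar_ref = \$number;

my $first = $array_ref->[0];      # the arrow dereferences a single element
my $fred  = $hash_ref->{fred};
my @copy  = @{$array_ref};        # sigil plus braces dereferences the whole array
${$scalar_ref} += 1;              # changes $number through the reference

print "$first $fred $number ", scalar(@copy), "\n";   # Mon 42 101 3
```

A reference is itself held in a scalar, which is how arrays and hashes can be nested or passed to subroutines as a single value.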
Subroutines
sub func1($$) {
    $parm1 = $_[0];   # first argument
    $parm2 = $_[1];   # second argument
    # Perl statements
    return 0;
}
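Arguments arrive in the array @_, and a common idiom is to copy them into named variables on the first line of the subroutine. The sketch below shows that idiom and also the fact that the elements of @_ are aliases for the caller's variables. The subroutine names are hypothetical.

```perl
use strict;
use warnings;

# Copy the arguments into named lexical variables.
sub add_pair {
    my ($first, $second) = @_;
    return $first + $second;
}

# @_ aliases the caller's variables, so assigning to $_[0]
# changes the variable that was passed in.
sub double_in_place {
    $_[0] *= 2;
    return;
}

my $total = add_pair(3, 4);
my $value = 5;
double_in_place($value);

print "$total $value\n";   # 7 10
```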
Special Variables
One aspect of Perl that contributes to its reputation for cryptic code is the use of special variables that have no meaningful names. There are meaningful alternatives to many of these, but they are less often used. To use the alternative names you must include:
use English;
__LINE__ | The current line number within the current file |
__FILE__ | The name of the current file |
__PACKAGE__ | The name of the current package |
__END__ | Indicates the end of the script |
__DATA__ | Like __END__ except it also indicates the start of the data that can be read through the DATA filehandle. The filehandle is already open, so there is no need to call open, which allows you to embed script and data in the same file |
_ | Represents the special filehandle used to cache information from the last successful stat, lstat, or file test operator |
$0 ($PROGRAM_NAME) | The name of the file containing the script currently being executed |
$1..$xx | The numbered variables $1,$2 and so on used to hold the contents of group matches both inside and outside of regular expressions |
$_ ($ARG) | Represents the default input and pattern searching space. For many functions and operations, if no specific variable is specified, the default input space will be used |
$& ($MATCH) | The string matched by the last successful pattern match |
$` ($PREMATCH) | The string preceding the information matched by the last pattern match |
$' ($POSTMATCH) | The string following the information matched by the last pattern match |
$+ ($LAST_PAREN_MATCH) | The last bracket matched by the last regular expression search pattern |
$* | Set to 1 to do multiline pattern matching within a string. The default value is 0. This has been superseded by the /s and /m modifiers to regular expressions |
@+ (@LAST_MATCH_END) | Contains the offsets of the ends of the last successful submatches from the last regular expression. Note that each value is the offset of the first character following the match, not the location of the match itself. The first element, $+[0], is the offset of the end of the entire match, which is equivalent to the value returned by the pos function. Therefore, $+[1] is the location where $1 ends, $+[2] is where $2 ends, and so on |
@- (@LAST_MATCH_START) | Contains the offsets of the beginnings of the last successful submatches from the last regular expression. The first element, $-[0], is the offset of the start of the entire match. Therefore, $-[1] is the offset where $1 starts, $-[2] is where $2 starts, and so on |
$. ($NR) ($INPUT_LINE_NUMBER) | The current input line number of the last file from which you read. |
$/ ($RS) ($INPUT_RECORD_SEPARATOR) | The current input record separator. This is newline by default |
@ISA | The array that contains a list of other packages to look through when a method call on an object cannot be found within the current package. The @ISA array is used as a list of base classes for the current package |
$| ($AUTOFLUSH) ($OUTPUT_AUTOFLUSH) (autoflush HANDLE EXPR) | By default $| is set to 0 and output is buffered. When $| is set to a non-zero value, the filehandle (current or specified) is automatically flushed after each write operation |
$, ($OFS) ($OUTPUT_FIELD_SEPARATOR) | The default output separator for the print series of functions. By default, print outputs the comma-separated fields you specify without any delimiter |
$\ ($ORS) ($OUTPUT_RECORD_SEPARATOR) | The default output record separator, appended to the end of each print. It is empty by default |
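To show how some of these variables behave, here is a small sketch that uses the English names alongside the offset arrays. The sample text, the pattern and the %found hash are invented for this example.

```perl
use strict;
use warnings;
use English;

my $text = 'set key=value now';   # hypothetical sample input
my %found;

if ($text =~ /(\w+)=(\w+)/) {
    %found = (
        pre        => $PREMATCH,           # same as $`
        match      => $MATCH,              # same as $&
        post       => $POSTMATCH,          # same as $'
        last_paren => $LAST_PAREN_MATCH,   # same as $+
        start_2    => $-[2],               # offset where $2 starts
        end_2      => $+[2],               # offset just past the end of $2
    );
}

printf "[%s][%s][%s]\n", @found{qw(pre match post)};   # [set ][key=value][ now]
print "$found{last_paren} $found{start_2} $found{end_2}\n";   # value 8 13

{
    # local restores the previous values when the block ends.
    local $OUTPUT_FIELD_SEPARATOR  = ', ';   # same as $,
    local $OUTPUT_RECORD_SEPARATOR = "\n";   # same as $\
    print 'a', 'b', 'c';                     # a, b, c followed by a newline
}
```

Setting the output separators with local inside a block keeps the change from leaking into the rest of the script.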