
Perl notes

Introduction

Dating back to 1987, Perl is a high-level, general-purpose, interpreted, dynamic programming language. The Perl language borrows features from C, shell script, AWK, and sed.

Advantages

Supported on most platforms
Automatic memory management
Regular expressions are integrated into the language
A scripting language that requires no development tools
Built-in debugger

Disadvantages

Very weak type checking (only 4 types available)
Use of "magic" variables that lack meaningful names
Poor number-crunching performance, because every value is a dynamically typed scalar that is converted between string and numeric form as needed
Has a reputation for compact and cryptic code

Variable Types

Perl has only four types: scalar, array, hash, and a reference to any of these three. This very limited range of types results in very poor type checking.

scalar - prefix $

The scalar type holds:

integers
floating point numbers
characters
strings
parameters received by subroutines

$number=100; 
$character='c';
$string="fred";

Because a scalar can hold either a string or a number, Perl provides separate numeric and string comparison operators, so the operator states how the contents of the scalar should be interpreted.

if ($number > 0)
if ($number == 0)
if ($number < 0)
if ($string gt "fred")
if ($string eq "fred")
if ($string lt "fred")
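The choice of operator matters most when a scalar holds digits. A minimal sketch, using illustrative variable names that are not part of the notes above, compares the same two values numerically and then as strings:

```perl
use strict;
use warnings;

my $x = 10;
my $y = 9;

# Numeric comparison: 10 is greater than 9.
my $numeric = ($x > $y) ? 'greater' : 'not greater';

# String comparison: "10" sorts before "9" because '1' precedes '9'.
my $stringy = ($x gt $y) ? 'greater' : 'not greater';

print "numeric: $numeric\n";   # numeric: greater
print "string:  $stringy\n";   # string:  not greater
```

Using == on strings or eq on numbers is legal Perl, so a wrong operator tends to show up as a wrong result rather than as an error.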

array - prefix @

The array type performs the same function as in other languages, but the syntax is unusual.

        @arrDays = ('Mon','Tue','Wed','Thu','Fri','Sat','Sun');
        $day = $arrDays[0]; # not @arrDays[0] as you might expect
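The prefix follows what the expression returns, not the variable it comes from. As a rough sketch of that rule, reusing the @arrDays array (the other variable names are illustrative):

```perl
use strict;
use warnings;

my @arrDays = ('Mon','Tue','Wed','Thu','Fri','Sat','Sun');

my $first   = $arrDays[0];       # one element: $ prefix
my $last    = $arrDays[-1];      # a negative index counts from the end
my @weekend = @arrDays[5, 6];    # a slice of several elements: @ prefix
my $count   = scalar @arrDays;   # an array in scalar context gives its length

print "$first $last @weekend $count\n";   # Mon Sun Sat Sun 7
```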

hash - prefix %

The hash type is an associative array of key/value pairs, like a dictionary in other languages.
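A short sketch of everyday hash use, with made-up names and values, might look like this:

```perl
use strict;
use warnings;

my %ages = (fred => 42, wilma => 39);   # hypothetical data

$ages{barney} = 40;                     # add a new key/value pair

my $fred  = $ages{fred};                # one value: $ prefix and curly braces
my $known = exists $ages{dino} ? 'yes' : 'no';

# keys returns the keys in no particular order, so sort them for stable output.
for my $name (sort keys %ages) {
    print "$name is $ages{$name}\n";
}
```

As with arrays, a single value is read with the $ prefix; the curly braces are what mark it as a hash lookup.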

reference

The reference type is akin to a pointer in C and needs to be dereferenced when used.
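As an illustration, the sketch below takes a reference to each of the three other types with the backslash operator and then dereferences it. The variable names and data are assumptions made for the example:

```perl
use strict;
use warnings;

my $number = 100;
my @days   = ('Mon', 'Tue', 'Wed');
my %ages   = (fred => 42);           # hypothetical data

my $scalar_ref = \$number;           # reference to a scalar
my $array_ref  = \@days;             # reference to an array
my $hash_ref   = \%ages;             # reference to a hash

$$scalar_ref += 1;                   # changes $number itself, now 101
my $second = $array_ref->[1];        # 'Tue'
my $count  = scalar @{$array_ref};   # dereference the whole array: 3
my $fred   = $hash_ref->{fred};      # 42

print "$number $second $count $fred\n";   # 101 Tue 3 42
```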

Subroutines

sub func1($$)
{
    my $parm1 = $_[0];    # first argument
    my $parm2 = $_[1];    # second argument
    
    # Perl statements
    
    return 0;
}
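Arguments arrive in the special array @_, and a common alternative to indexing it is to copy the whole list into named variables in one statement. A minimal sketch, where add_pair is a hypothetical subroutine rather than anything defined in these notes:

```perl
use strict;
use warnings;

# Hypothetical subroutine: returns the sum of its two arguments.
sub add_pair {
    my ($left, $right) = @_;   # copy the arguments into named lexical variables
    return $left + $right;
}

my $sum = add_pair(2, 3);
print "$sum\n";   # 5
```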

Special Variables

One aspect of Perl that contributes to its reputation for cryptic code is the use of special variables that have no meaningful names. There are meaningful alternatives to many of these, but they are less often used. To use these you must include:

        use English;

__LINE__	The current line number within the current file
__FILE__	The name of the current file
__PACKAGE__	The name of the current package
__END__	Indicates the end of the script
__DATA__	Like __END__ except it also indicates the start of the data available through the DATA filehandle, which can be read without calling open, therefore allowing you to embed script and data in the same file
_	Represents the special filehandle used to cache information from the last successful stat, lstat, or file test operator
$0 ($PROGRAM_NAME)	The name of the file containing the script currently being executed
$1..$xx	The numbered variables $1,$2 and so on used to hold the contents of group matches both inside and outside of regular expressions
$_ ($ARG)	Represents the default input and pattern searching space. For many functions and operations, if no specific variable is specified, the default input space will be used
$& ($MATCH)	The string matched by the last successful pattern match
$` ($PREMATCH)	The string preceding the information matched by the last pattern match
$' ($POSTMATCH)	The string following the information matched by the last pattern match
$+ ($LAST_PAREN_MATCH)	The text matched by the last bracket (capture group) of the last regular expression search pattern
$*	Set to 1 to do multiline pattern matching within a string. The default value is 0. This has been superseded by the /s and /m modifiers to regular expressions, and was removed in Perl 5.10
@+ (@LAST_MATCH_END)	Contains the offsets of the ends of the last successful submatches from the last regular expression. Note that each entry is the offset of the first character following the match, not the location of the match itself. The first index, $+[0], is the offset to the end of the entire match, the equivalent of the value returned by the pos function. Therefore, $+[1] is the location where $1 ends, $+[2] is where $2 ends, and so on
@- (@LAST_MATCH_START)	Contains the offsets of the beginnings of the last successful submatches from the last regular expression. The first index, $-[0], is the offset to the start of the entire match. Therefore, $-[1] is the offset where $1 starts, $-[2] is where $2 starts, and so on
$. ($NR) ($INPUT_LINE_NUMBER)	The current input line number of the last file from which you read.
$/ ($RS) ($INPUT_RECORD_SEPARATOR)	The current input record separator. This is newline by default
@ISA	The array that contains a list of other packages to look through when a method call on an object cannot be found within the current package. The @ISA array is used as a list of base classes for the current package
$| ($OUTPUT_AUTOFLUSH) (autoflush HANDLE EXPR)	By default $| is set to 0 and output is buffered. When $| is set to a non-zero value, the filehandle (current, or specified) is automatically flushed after each write operation.
$, ($OFS) ($OUTPUT_FIELD_SEPARATOR)	The output field separator for the print series of functions. By default, print outputs the comma-separated fields you specify without any delimiter.
$\ ($ORS) ($OUTPUT_RECORD_SEPARATOR)	The output record separator, appended after the last argument of each print. Nothing is appended by default
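To show how several of these variables fit together, here is a small sketch that uses the English names alongside the punctuation forms. The input string and the variable names are assumptions made for the example:

```perl
use strict;
use warnings;
use English;

my $text = 'order 1234 shipped';   # hypothetical input

my ($before, $match, $after, $start, $end);
if ($text =~ /(\d+)/) {
    $before = $PREMATCH;     # same as $`  -> 'order '
    $match  = $MATCH;        # same as $&  -> '1234'
    $after  = $POSTMATCH;    # same as $'  -> ' shipped'
    $start  = $-[1];         # offset where $1 starts -> 6
    $end    = $+[1];         # offset just past the end of $1 -> 10
}
print "[$before] [$match] [$after] $start-$end\n";

{
    # Localise the separators so the change ends with this block.
    local $OUTPUT_FIELD_SEPARATOR  = ',';    # $,
    local $OUTPUT_RECORD_SEPARATOR = "\n";   # $\
    print 'a', 'b', 'c';                     # prints a,b,c and a newline
}
```

Setting the separators with local keeps the change from leaking into unrelated print statements elsewhere in the script.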
