1 Introduction to Perl
1.1 A Scientific Hello World Script
1.1.1 Reading and Writing Data Files
1.1.2 The Complete Code
1.1.3 Dissection
1.1.4 The Concept of Context in Perl
1.2 Automating Simulation and Visualization
1.2.1 The Complete Code
1.2.2 Dissection
1.3 There's More Than One Way To Do It
1.3.1 A Script for Perl Beginners
1.3.2 Using the Underscore Variable
1.3.3 A Script Written in Typical Perl Style
1.3.4 Shorter Scripts for Lazy Programmers
1.3.5 The Ultimate Goal: Getting Rid of the Script File
1.3.6 Perl Has a Grep Function Too
1.4 Frequently Encountered Tasks
1.4.1 Basic Control Statements
1.4.2 File Reading and Writing
1.4.3 Running an Application
1.4.4 One-Line Perl Scripts
1.4.5 Array and List Operations
1.4.6 Hash Operations
1.4.7 Splitting and Joining Text
1.4.8 Text Processing
1.4.9 String Operations
1.4.10 Environment Variables
1.4.11 Subroutines
1.4.12 Nested, Heterogeneous Data Structures
1.4.13 Testing a Variable's Type
1.4.14 Numerical Expressions
1.4.15 Listing of Files in a Directory
1.4.16 Testing File Types
1.4.17 Copying and Renaming Files
1.4.18 Creating and Moving to Directories
1.4.19 Removing Files and Directories
1.4.20 Splitting Pathnames
1.4.21 Traversing Directory Trees
1.4.22 Downloading Internet Files
1.4.23 CPU-Time Measurements
1.4.24 Programming with Classes
1.4.25 Debugging Perl Scripts
1.4.26 Regular Expressions
1.4.27 Exercises
1.4.28 Building and Using Modules
1.4.29 Binary Input/Output
1.5 Installing Perl and Additional Modules
1.5.1 Installing Basic Perl
1.5.2 Manual Installation of Perl Modules
1.5.3 Automatic Installation of Perl Modules
1.5.4 The Required Perl Modules
1.6 Perl Versus Python
1.6.1 Python's Advantages
1.6.2 Perl's Advantages
1.6.3 Efficiency
1.7 GUI Programming with Perl/Tk
1.7.1 The First Perl/Tk Encounter
1.7.2 The Similarity of Python/Tkinter and Perl/Tk
1.7.3 Binding Events
1.8 Web Interfaces and CGI Programming
1.8.1 Web Versions of the Scientific Hello World Program
1.8.2 Debugging CGI Scripts in Perl with CGI::Debug
1.8.3 Using Perl's CGI Module to Construct Forms
2 Introduction to Tcl/Tk
2.1 A Scientific Hello World Script
2.1.1 The Complete Code
2.1.2 Dissection
2.2 Reading and Writing Data Files
2.2.1 The Complete Code
2.2.2 Dissection
2.2.3 Double Quotes, Braces, Brackets, and Variable Substitution
2.3 Automating Simulation and Visualization
2.3.1 The Complete Code
2.3.2 Dissection
2.4 Frequently Encountered Tasks
2.4.1 File Reading and Writing
2.4.2 Running an Application
2.4.3 List Operations
2.4.4 Associative Array Operations
2.4.5 Splitting and Joining Text
2.4.6 Text Processing
2.4.7 String Operations
2.4.8 Numerical Expressions
2.4.9 Environment Variables
2.4.10 Procedures
2.4.11 Listing of Files in a Directory
2.4.12 Testing File Types
2.4.13 Copying and Renaming Files
2.4.14 Creating and Moving to Directories
2.4.15 Removing Files and Directories
2.4.16 Splitting Pathnames
2.4.17 Traversing Directory Trees
2.4.18 CPU-Time Measurements
2.4.19 Programming with Classes
2.4.20 Building and Using Packages
2.4.21 Regular Expressions
2.4.22 Exercises
2.5 GUI Programming with Tcl/Tk
2.5.1 The First Tcl/Tk Encounter
2.5.2 Binding Events
2.5.3 Widget Name Hierarchy
2.5.4 The Similarity of Python/Tkinter and Tcl/Tk
2.5.5 Using Variables in Widget Names
2.5.6 Configuring Widgets
2.5.7 The Grid Geometry Manager
Preface
Introduction to Perl
This chapter gives a quick introduction to the Perl language for readers
who are familiar (at least to some extent) with the Python scripts from
Chapters 2.1, 2.3, and 3 in the book [?]. We
shall look at the same sample scripts and show how the syntax changes when
we program in Perl.
A Web version of the man pages can be found in the doc.html file. There you
can also find the Perl FAQ and a quick reference.
Having grasped the basic introduction to Perl from this appendix, you
will find the definitive Perl reference, the famous Camel book [?], very useful.
However, much of the text in [?] coincides with the Perl man pages. If you feel
that a more comprehensive introduction to Perl is needed, Learning Perl
[?] and [?] are recommended. Ready-made recipes for numerous common
tasks in scripting are collected in the highly recommended Perl Cookbook
[?]. Advanced features of Perl are well discussed in [?] and [?]. Some Web
resources regarding Perl topics are listed in doc.html.
The first Perl encounter consists of three of the examples from the
introduction to Python in Chapter 2 in [?].
We start out with a Hello World script, before continuing with a script
concerning file handling and array processing. Thereafter we present a script
gluing a simulation and a visualization program. All the scripts referred to
in this section are found in src/perl. In Chapter 1.4 we then list, in an
example-oriented way, some basic and useful Perl functionality for quick
reference. Chapter 1.5 explains how to install Perl and additional modules. A brief
comparison of Perl versus Python appears in Chapter 1.6, while Chapters 1.7
and 1.8 deal with graphical user interfaces: standard GUIs and dynamic Web
pages, respectively.
Comments in Perl start with # and continue for the rest of the line. However,
the first line #!/usr/bin/perl has a special meaning: Under Unix it tells the
system that the script, if run as an executable file, is to be interpreted by
the program /usr/bin/perl. If the executable Perl interpreter is stored in
another path on your system, you must write the correct full path in the top
line of the script or (usually better) use a different header to be presented
in Chapter 1.1.1.
Scalar variables in Perl are always preceded by a $ sign, i.e., $r and $s are
scalar variables in the present script. The command-line arguments to a Perl
script are automatically stored in the array ARGV. Subscripting this array is
done as in $ARGV[0] (which implies extracting the first entry; arrays in Perl
start with 0 as in C and Python). The length of the array is $#ARGV+1, i.e.,
$ARGV[$#ARGV] is the last entry of the array. The array itself as a variable is
reached with the syntax @ARGV (and one can say, e.g., print "ARGV=@ARGV").
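The subscripting rules above can be tried out in a few lines. This is only a minimal sketch: the two command-line arguments are made up for illustration and are assigned to @ARGV inside the script, whereas in a real run they come from the shell.

```perl
use strict;
use warnings;

# simulate the command line "script.pl 0.1 out.dat" (hypothetical arguments)
local @ARGV = ("0.1", "out.dat");

my $first  = $ARGV[0];         # first entry (index 0)
my $last   = $ARGV[$#ARGV];    # last entry; $#ARGV is the last legal index
my $length = $#ARGV + 1;       # number of entries in the array

print "first=$first last=$last length=$length\n";
print "ARGV=@ARGV\n";          # the entries are joined by a blank
```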
Variables can be directly inserted into a text string, a convenient feature
called variable interpolation:
print "Hello, World! sin($r)=$s\n"; # print to screen
or you can make the file executable under Unix (chmod a+x hw.pl) and then
just write
./hw.pl 0.1
sub myfunc {
my ($y) = @_;
if ($y >= 0.0) { return $y**5.0*exp(-$y); }
else { return 0.0; }
}
1.1.3 Dissection
The Perl script starts with a header
: # *-*-perl-*-*
eval 'exec perl -w -S $0 ${1+"$@"}'
if 0; # if running under some shell
implies interpreting the code by the first perl program encountered in the
directories listed in your PATH environment variable. The explanation of all
the details in our Perl header is intricate, but it can be found in the file
src/perl/headerfun.sh. (This is actually a document written in Bash (!) so
you need to run the file to get the document printed.)
In the case where the user has failed to provide two command-line arguments,
we want to write a usage message and abort the script. This is
accomplished by Perl's die statement: die prints a string on standard error
and terminates the script. In the present example the script dies if there are
fewer than two command-line arguments:
die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;
Recall that $#ARGV is the last legal index in @ARGV, i.e., the length of @ARGV is
$#ARGV+1, so the test is $#ARGV+1 < 2, leading to $#ARGV < 1.
Extracting the first two command-line arguments can be performed by
standard subscripting:
$infilename = $ARGV[0];
$outfilename = $ARGV[1];
However, it is more common (and elegant) to use Perl's list assignment
construction:
($infilename, $outfilename) = @ARGV;
The list on the left-hand side is set equal, entry by entry, to the entries in
the array on the right-hand side. We refer to the remark at the end of this
section for an explanation of the difference between list and array in Perl
terminology.
Opening files in Perl is done with the open function:
open(INFILE, "<$infilename"); # open for reading
open(OUTFILE, ">$outfilename"); # open for writing
The first argument to open is a file handle, which is used for accessing the
file in the Perl code. Input files are recognized by < in front of the name
(see footnote 1), > signifies an output file, and >> implies that text will be
appended to the file.
Reading from a file handle, line by line, is accomplished by
while (defined($line=<INFILE>)) {
# process $line
}
In the present script we want to split the line into an array of words, separated
by whitespace. The split function performs this task:
($x, $y) = split(' ', $line); # extract x and y value
One way of printing the transformed coordinate pair to the output file is to
apply the printf function:
printf(OUTFILE "%g %12.5e\n", $x, $fy);
The core of a printf call is the format string, which follows the same syntax as
in C and Python (and all other languages that support C's printf style
for formatting). Perl's ordinary print function can also be used for writing
to files, e.g., print OUTFILE "$x $fy\n";
The myfunc function is defined as
sub myfunc {
my ($y) = @_;
if ($y >= 0.0) { return $y**5.0*exp(-$y); }
else { return 0.0; }
}
The most striking difference from subprograms in other languages is that the
argument list is not a part of the subroutine heading. Instead, all arguments
are available in an array @_. The first step is normally to store the arguments
in local variables:
my ($y) = @_;
The my keyword tells that all variables on the left-hand side are declared as
local variables in the subroutine. This is a good habit as using unintended
global variables inside a subroutine may have undesired effects in other parts
of the script.
Footnote 1: If there is no < symbol, the file is opened for reading. In fact,
open(F,"<$name"), open(F,"$name"), and open(F,$name) all lead to opening
a file with name $name for reading.
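The statements dissected in this section can be assembled into one small script. The following is a sketch under stated assumptions, not the listing of datatrans1.pl itself: the subroutine name transform_file is hypothetical, and the or die tests on open are an addition for robustness.

```perl
use strict;
use warnings;

sub myfunc {
    my ($y) = @_;
    if ($y >= 0.0) { return $y**5.0*exp(-$y); }
    else           { return 0.0; }
}

# read (x,y) pairs from $infilename, write (x,f(y)) to $outfilename
# (transform_file is a hypothetical name used only in this sketch)
sub transform_file {
    my ($infilename, $outfilename) = @_;
    open(INFILE,  "<$infilename")  or die "cannot open $infilename; $!\n";
    open(OUTFILE, ">$outfilename") or die "cannot open $outfilename; $!\n";
    while (defined(my $line = <INFILE>)) {
        my ($x, $y) = split(' ', $line);   # extract x and y value
        my $fy = myfunc($y);               # transform y value
        printf(OUTFILE "%g %12.5e\n", $x, $fy);
    }
    close(INFILE);
    close(OUTFILE);
}

# run only when two command-line arguments are given
if (@ARGV >= 2) {
    my ($infilename, $outfilename) = @ARGV;
    transform_file($infilename, $outfilename);
}
```

Wrapping the loop in a subroutine is a design choice made here so that the file transformation can be called from other code as well as from the command line.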
As in Chapter 2.2 in [?], we can
modify datatrans1.pl such that (i) the file is loaded into an array of lines,
(ii) the x and y coordinates are stored in two arrays, and (iii) the output file
is written by a for loop over the array entries.
We start with making the open statement a bit more robust. Perl does not
by default write any error message if the file we try to open does not exist.
This can be quite annoying, but the problem is solved by a "try something
or die" construction:
open(INFILE, "<$infilename")
or die "unsuccessful opening of $infilename; $!\n";
The $! variable is a special variable in Perl containing the last error message
issued by the operating system.
Loading a file into an array of lines is enabled by the syntax
@lines = <INFILE>;
# equivalent syntax:
foreach $line (@lines) {
# process $line
}
In the present case we want to create two arrays, @x and @y, containing the
x and y coordinates:
@x = (); @y = (); # start with empty arrays
for $line (@lines) {
($xval, $yval) = split(' ', $line);
push(@x, $xval); push(@y, $yval);
}
The x and y coordinates are extracted by splitting the line with respect to
whitespace, exactly as we did in the datatrans1.pl code. The push function
appends new array entries.
Creating the output file can now be performed by a C-like for loop over
the array indices:
open(OUTFILE, ">$outfilename")
or die "unsuccessful opening of $outfilename; $!\n";
Recall that $#x is the last valid index in the array @x. The complete code is
found in src/perl/datatrans2.pl.
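A C-like for loop over the array indices could look as follows. This is a sketch, not the code from datatrans2.pl: the sample values in @x and @y are made up, and the formatted lines are collected in an array (a hypothetical @outlines) instead of being written to OUTFILE, so that the snippet runs on its own.

```perl
use strict;
use warnings;

sub myfunc {
    my ($y) = @_;
    if ($y >= 0.0) { return $y**5.0*exp(-$y); }
    else           { return 0.0; }
}

# hypothetical sample data standing in for the arrays built from the file
my @x = (0.1, 0.2, 0.3);
my @y = (1.0, -2.0, 0.0);

my @outlines;
for (my $i = 0; $i <= $#x; $i++) {     # $#x is the last valid index in @x
    my $fy = myfunc($y[$i]);
    push(@outlines, sprintf("%g %12.5e\n", $x[$i], $fy));
}
print @outlines;   # in a script one would call printf(OUTFILE ...) in the loop
```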
evaluates the list on the right-hand side in a list context, and @a becomes
an array variable having its entries equal to the three scalars in the list
("a","b","c"). When assigning the list to a scalar,
$a = ("a","b","c");
the list on the right-hand side is evaluated in a scalar context. In this case,
the value of the list is the value of the last element (as with the C comma
operator). Therefore, $a becomes "c". On the other hand,
$b = @a;
evaluates the array variable @a in a scalar context, and its value is then the
length of the array. That is, $b becomes 3.
These examples show that an array variable can have a list as value in
a list context and its length as value in a scalar context. A hash evaluated
in a scalar context becomes true if there are elements in the hash, and false
otherwise (see footnote 2).
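The effect of context can be checked with a few assignments. The variable names below are chosen for this sketch only; the loop bound shows an array being evaluated in scalar context inside a numerical comparison.

```perl
use strict;
use warnings;

my @a = ("a", "b", "c");      # list context: @a gets three entries
my $n = @a;                   # scalar context: $n becomes the length, 3
my %h = (m => 1.0);           # a non-empty hash ...
my $nonempty = %h ? 1 : 0;    # ... is true when tested in a scalar context

# an array used as loop bound is evaluated in scalar context (its length)
for (my $i = 0; $i < @a; $i++) {
    print "a[$i]=$a[$i]\n";
}
print "n=$n nonempty=$nonempty\n";
```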
The property that an array evaluates to its length in a scalar context is
often taken advantage of by Perl programmers. Two common applications
are
Footnote 2: There is more information in the scalar value, see the perldata man page.
yields the date as a string; $t is "Sun May 13 09:02:27 2001", for instance. In
a list context,
@t = localtime();
localtime returns a list of nine values containing the time, day, month, year,
etc. (see perldoc -f localtime), and @t becomes an array of numbers (say)
(27, 2, 9, 13, 4, 101, 0, 132, 1).
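The same context dependence can be inspected in a reproducible way with gmtime, which behaves like localtime but ignores the local time zone. The sketch below uses time 0 (the start of 1970) as an arbitrary fixed input; the names of the nine list entries follow perldoc -f localtime.

```perl
use strict;
use warnings;

my $t = gmtime(0);      # scalar context: the date as a string
my @t = gmtime(0);      # list context: nine numbers
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = @t;

print "$t\n";
# the month is counted from 0 and the year from 1900:
printf("%04d-%02d-%02d\n", $year + 1900, $mon + 1, $mday);
```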
# run simulator:
$cmd = "oscillator < $case.i"; # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;
1.2.2 Dissection
The script starts with a safe Perl header, which ensures interpretation of
the script by the first Perl interpreter found in the user's path. After having
assigned default values to the input parameters to the oscillator code, we
encounter an important part of many scripts, namely parsing of command-line
arguments. The idea is that we "eat" the entries in @ARGV one by one
using the shift operator:
$option = shift @ARGV;
This statement implies setting $option equal to the first element in @ARGV
and then removing this element from @ARGV. We search for options on the
command line until the @ARGV array is empty:
while (@ARGV) { # while @ARGV is non-empty
$option = shift @ARGV; # load command-line arg. into $option
if ($option eq "-m") {
$m = shift @ARGV; # load next command-line arg
}
elsif ($option eq "-b") { $b = shift @ARGV; }
...
else {
die "$0: invalid option $option\n";
}
}
The syntax m=f means searching for the command-line argument --m and
loading the following argument as a floating-point number (=f) into the Perl
variable $m. A single hyphen as in -m works too. Similarly, func=s specifies
--func to take a string argument. The specification of the flag screenplot
allows us to use either --screenplot for setting $screenplot to a true value
or --noscreenplot for setting $screenplot to a false value (note: to get this
on/off behavior, the exclamation mark is required in
"screenplot!" => \$screenplot). The GetOptions function has a rich
functionality; the purpose here is just to notify the reader about the
existence of such a handy function. Instructive information is obtained from
perldoc Getopt::Long. There are several other modules in the Getopt family,
for example Getopt::Simple for a simplified interface to Getopt::Long,
Getopt::Std for single-character options, Getopt::Mixed for long and
single-character options, and Getopt::Declare for handling command-line
options or configuration files with associated help text and initialization
code.
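A minimal sketch of a GetOptions call with the three kinds of options just described may look like this. Only three of the script's options are included, the default values are made up, and the command line is simulated by assigning to @ARGV inside the script.

```perl
use strict;
use warnings;
use Getopt::Long;    # load module with the GetOptions function

# default values (hypothetical, for illustration only)
my ($m, $func, $screenplot) = (1.0, "y", 1);

# simulate a command line; in a real script @ARGV is set by the shell
local @ARGV = ("--m", "2.5", "--func", "siny", "--noscreenplot");

GetOptions("m=f"         => \$m,            # --m takes a floating-point number
           "func=s"      => \$func,         # --func takes a string
           "screenplot!" => \$screenplot)   # --screenplot or --noscreenplot
    or die "$0: invalid command-line options\n";

print "m=$m func=$func screenplot=$screenplot\n";
```

GetOptions removes the options it recognizes from @ARGV, so any remaining entries can be treated as ordinary arguments afterwards.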
The next step in our script is to move to the prescribed directory. However,
we should first check whether the directory exists, and if so, we should delete it
and recreate it to avoid mismatch between old and new result files. Checking
if a directory exists is done by the command if (-d $directoryname) in Perl.
Removing a non-empty directory can be conveniently done by first loading an
external Perl module, use File::Path, and then calling the function rmtree
in that module:
use File::Path; # has the rmtree function
if (-d $dir) { # does $dir exist?
rmtree($dir); # remove directory (old files)
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
chdir($dir) or die "Could not move to $dir; $!\n";
Observe that we test for success of mkdir. For example, insufficient permission
to create a new directory will not be noticeable when running the script unless
we include the or die statement (see footnote 4).
The next task is to write an input file for the oscillator program. Multi-line
output can easily be created through an ordinary string with embedded
newlines (see footnote 5):
print F "
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
";
Footnote 4: Python will in such cases abort the script and write a "Permission
denied" message to standard output. See Exercise 1.8.
Footnote 5: Python requires a triple quoted string for this purpose.
Everything between the two EOF marks is treated as output text. The enclos-
ing EOF must start in the first column of the script file. The Gnuplot script
later in the simviz1.pl code is actually written as a here document.
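As a sketch of the here document technique, the input file for the oscillator program could be written as shown next. The parameter values, the hash %p holding them, and the use of a temporary directory are assumptions made for this example; simviz1.pl uses its own scalar variables.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# hypothetical parameter values collected in a hash
my %p = (m => 1.0, b => 0.7, c => 5.0, func => "y",
         A => 5.0, w => 6.28, y0 => 0.2, tstop => 30.0, dt => 0.05);

my $dir  = tempdir(CLEANUP => 1);   # scratch directory, removed at exit
my $case = "tmp1";
open(F, ">$dir/$case.i") or die "cannot open $dir/$case.i; $!\n";
# everything up to the line containing only EOF is printed,
# with variables interpolated:
print F <<EOF;
$p{m}
$p{b}
$p{c}
$p{func}
$p{A}
$p{w}
$p{y0}
$p{tstop}
$p{dt}
EOF
close(F);
```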
Perl's system function is used for running applications:
$cmd = "oscillator < $case.i"; # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;
# make plot:
$failure = system("gnuplot $case.gnuplot");
die "running gnuplot failed\n" if $failure;
Never forget to close files before continuing with system commands involving
the generated files!
searches all files (*) in the current working directory for the text string
superLibFunc and writes out the matches. This can help you find the file
you are looking for. We shall present a cross-platform Perl script, which
implements the grep functionality.
which is a test whether the variable $line matches the regular expression
contained in $string. If so, we write out this line.
#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach (@files) {
next unless -f;
open FILE, $_;
foreach (<FILE>) { print if /$pattern/; }
close FILE;
}
The next unless -f statement means that one jumps to the next iteration in
the loop unless the test if (-f $_) is true, i.e., unless the current filename
($_) is an existing file.
The while (<>) loop implies reading all lines in all files whose names are
in @ARGV. (If there are no filenames on the command line, <> reads from
standard input.) Since processing a list of files in a line-oriented fashion is
a frequently encountered task in scripts, while (<>) is a popular and widely
used construction that saves quite some typing. It goes without saying that
each line is available in the $_ variable.
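A short grep-like script built on while (<>) might read as follows. This is a sketch and not necessarily the version in src/perl: the sample file and the pattern are made up, the command line is simulated by assigning to @ARGV, and matches are collected in a hypothetical array @hits before being printed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# create a small sample file (hypothetical content)
my ($fh, $tmpname) = tempfile(UNLINK => 1);
print $fh "call superLibFunc(x)\nnothing here\nsuperLibFunc again\n";
close($fh);

# simulate the command line: script.pl superLibFunc <file>
local @ARGV = ("superLibFunc", $tmpname);

my $pattern = shift;     # shift works on @ARGV in the main program
my @hits;
while (<>) {             # loop over all lines in the files left in @ARGV
    push(@hits, $_) if /$pattern/;   # the current line is stored in $_
}
print @hits;
```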
Here, the -n option tells Perl to invoke a loop over all lines in all files specified
on the command line (equivalent to while (<>)) and execute the string after
-e as a Perl script applied to each line. Implicit here is that the line is stored
in the $_ variable.
#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach $file (@files) {
if (-f $file) {
open FILE, "<$file";
@lines = <FILE>;
@match = grep /$pattern/, @lines; # Perl grep
print "$file: @match";
close FILE;
}
}
The grep function searches for $pattern in a list of all the lines in the file and
returns a list with the lines that match $pattern. Of course, this readable
script can be condensed to two lines if desired, using the <> notation:
#!/usr/bin/perl
$pattern = shift;
print grep /$pattern/, <>;
Remark. We should mention that reading the whole file into memory at once,
which is implied by @lines=<FILE> and also the <> operator, may face memory
problems if you work with large data files. The line-by-line reading can then
be more appropriate.
Extend this script such that the filename and the line number are printed
at the beginning of the lines that match the given string. You can count
the number of lines in the last foreach loop, or you can make use of Perl's
special variable $., which holds the line number of the current line. Write
the line number in a field of width (say) 5 characters such that the output
is nicely aligned in three columns (filename, line number, line), see
Exercise 8.4 on page 349 in [?] for a
sample output.
Observe how simple such an extension would have been if we had used
named variables instead of $_, or in other words, readability and extendability
are seldom well supported by extensive use of $_.
Frequently encountered tasks in Perl scripts have been collected and orga-
nized in the present section, with the aim of providing a kind of example-
oriented quick reference for the reader. The following tasks are covered:
basic control structures,
file reading and writing,
(multi-line) output with format control,
executing other programs,
working with arrays and hashes,
splitting, joining, searching, and replacing text,
writing and calling Perl subroutines,
checking a file's type, size, and age,
listing and removing files,
creating and removing directories,
moving to directories, traversing directory trees,
measuring CPU time,
building simple Perl modules,
working with regular expressions.
The for or foreach statement visits the entries in an array, entry by entry:
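A minimal sketch of such a loop, with made-up filenames and a hypothetical array @seen that records the visited entries, is given below. The second loop shows the equivalent C-style traversal by index.

```perl
use strict;
use warnings;

my @files = ("case1.ps", "case2.ps", "case3.ps");   # hypothetical names
my @seen;
for my $file (@files) {        # foreach can be written instead of for
    push(@seen, $file);
    print "$file\n";
}

# a C-style loop over the indices visits the same entries
for (my $i = 0; $i <= $#files; $i++) {
    print "$i: $files[$i]\n";
}
```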
$r = 0; $dr = 0.1;
do {
$s = sin($r); print "$s\n";
$r += $dr;
} while ($r <= 10);
The next statement continues with the next iteration in the loop:
# skip entries that are not regular files:
for $file (@files) {
next if not -f $file; # continue with next file
# process $file:
...
}
The recipe for opening a file for writing a list of lines is given next.
$outfilename = "myprog2.cpp";
open(OUTFILE, ">$outfilename") # open for writing
or die "Cannot write to file $outfilename; $!\n";
$line_no = 0; # count the line number in @lines
foreach $line (@lines) {
$line_no++;
print OUTFILE "$line_no: $line";
}
close(OUTFILE);
We can proceed with appending text to a file, using Perl's features for writing
(large) blocks of text in one output statement, with embedded variables if
desired:
open(OUTFILE, ">>$filename") # open for appending
or die "Cannot append to file $filename; $!\n";
# print multiple lines at once, using a here document:
print OUTFILE <<EOF;
/*
This file, "$outfilename", is a version
of "$infilename" where each line is numbered.
*/
EOF
close(OUTFILE);
If you need to treat a file handle, such as OUTFILE, like a variable, e.g.,
when sending it to a function, you should use Perl's FileHandle objects, see
perldoc FileHandle.
The return value from system is also available in the special Perl variable $?:
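A small sketch of inspecting the return value and $? follows. To keep the example independent of any installed application, the command that is run is the current Perl interpreter itself (available in the special variable $^X), asked to exit with the arbitrary status 3.

```perl
use strict;
use warnings;

# run a child process that exits with status 3 (arbitrary value)
my $failure = system($^X, "-e", "exit 3");

# the exit code of the program is stored in the high byte of $?
my $exit_status = $? >> 8;
print "system returned $failure, \$? is $?, exit status is $exit_status\n";
```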
To redirect the output from the application into a list of lines, one can
use back quotes:
$cmd = "myprog -c file.1 -p -f -q";
@res = `$cmd`;
Alternatively, one can open a pipe to the application and read the output as
if it were a file:
open(APP, "$cmd |");
@res = <APP>;
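Reading the output of an application through a pipe can be sketched as below. The application is again the Perl interpreter itself ($^X) printing two lines, chosen only to make the example self-contained; the three-argument form of open with "-|" and a list of arguments is used here instead of the "$cmd |" string so that no shell quoting is involved.

```perl
use strict;
use warnings;

# open a read pipe from a child process printing two lines
open(my $app, "-|", $^X, "-e", 'print "line 1\nline 2\n"')
    or die "cannot start application; $!\n";
my @res = <$app>;                    # read the output as if it were a file
close($app) or warn "application failed; status $?\n";

print "got ", scalar(@res), " lines\n";
```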
Pipes can also be used for running interactive applications. After having
opened a write pipe to a program, we can issue various commands, which are
executed upon closing the pipe. Here is an example involving the interactive
Gnuplot program:
open (GNUPLOT, "| gnuplot -persist"); # open a pipe to Gnuplot
print GNUPLOT "set xrange [0:10]; set yrange[-2:2]\n";
print GNUPLOT "plot sin(x)\n"; # draw a sine function
print GNUPLOT "quit\n";
close(GNUPLOT); # run Gnuplot with the commands
runs a loop over all lines in file1, file2, and file3. For each line, the Perl
commands provided inside the quotes (after the -e option) are executed,
and the -p option implies that the line is printed after execution of the
commands. Without the -i option the printing goes to standard output, but
with -i the files are modified in-place, i.e., the original file is replaced by the
new output. With -i.bak the file file1 is first copied to file1.bak before it
is overwritten. The -p and -i.bak options are normally combined into
-pi.bak. Each line in the files is stored in $_. As an illustration we can let the
script specified by the -e option be s/float/double/g; meaning that float is
replaced by double in some files (here file1, file2, and file3):
perl -pi.bak -e 's/float/double/g;' file1 file2 file3
but @arglist does not have an array as the third element; the @arr array's
entries are simply inserted in @arglist, i.e., @arglist now contains
($myarg1, "displacement", $var1, $var2, "tmp.ps");
To force the third entry to be the @arr array, this entry must be a reference
to @arr, obtained by prefixing @arr with a backslash (see page 29):
@arglist = ($myarg1, "displacement", \@arr, "tmp.ps");
New entries can be appended to an array using the push function, e.g.,
push(@arglist, $myvar2);
push(@arglist, @arr2);
The shift function returns and removes the first array element:
$first_entry = shift @arglist;
The pop function returns and removes the last array element:
$last_entry = pop @arglist;
Without arguments, shift and pop work on @ARGV in the main program and
@_ in subroutines, e.g.,
sub myroutine {
my $arg1 = shift; # same as shift @_;
my $arg2 = shift;
...
}
Footnote 6: Similar list assignments in Python require that the lists on each
side of the assignment operator have equal lengths.
creates a new array @a where each element is a copy of the corresponding array
element in @b. To make a refer to the array b, as in the Python assignment a
= b, we need to let a be a reference:
$a = \@b;
See page 29 for more information about references and how to access the
values referred to by $a.
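The difference between copying an array and referring to it can be demonstrated in a few lines. The variable names @copy and $ref are chosen for this sketch.

```perl
use strict;
use warnings;

my @b    = (1, 2, 3);
my @copy = @b;       # new array; each entry is a copy
my $ref  = \@b;      # reference to @b, nothing is copied

$b[0] = 100;         # change the original array

# the copy is unaffected, the reference sees the change
print "copy: $copy[0], via reference: $$ref[0] or $ref->[0]\n";
print "length via reference: ", scalar(@$ref), "\n";
```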
Reversing the order of the entries in an array is performed by the reverse
function:
@reversed_strlist = reverse(@strlist);
sub numeric_sort {
if ($a < $b) { return -1; }
elsif ($a == $b) { return 0; }
else { return 1; }
}
@sorted_list = sort numeric_sort @list;
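The comparison subroutine can also be given inline as a block, where the <=> operator returns -1, 0, or 1 just like numeric_sort. The sketch below, with made-up numbers, contrasts this with the default string sort.

```perl
use strict;
use warnings;

my @list = (10, 9, 100, 1.5);
my @sorted_list   = sort { $a <=> $b } @list;   # numeric, ascending
my @reverse_list  = sort { $b <=> $a } @list;   # numeric, descending
my @string_sorted = sort @list;                 # default: string comparison

print "@sorted_list\n";     # 1.5 9 10 100
print "@reverse_list\n";    # 100 10 9 1.5
print "@string_sorted\n";   # 1.5 10 100 9
```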
With this technique you could develop various tools for initializing and pro-
cessing command-line options, and each time you need to add a new variable
and a corresponding option to the script, you can simply add one new line
to the initialization of the default values in the hash cmlargs.
results in @filenames as
("case1.ps", "case2.ps", "case3.ps")
The split function can also split with respect to a regular expression, just
as re.split in Python, e.g.,
$files = "case1.ps, case2.ps, case3.ps";
@filenames = split(/,\s*/, $files);
Note that in Perl, the comparison operators for strings and numbers are
different8 (e.g., eq and ne for strings vs. == and != for numbers, see also
Chapter 1.4.14).
Here is an example regarding substituting double by float everywhere in
a file:
$copyfilename = "$filename.old~~";
rename($filename, "$copyfilename"); # take a copy of the file
open(FILE, "<$copyfilename") or die "$0: couldn't open file; $!\n";
$filestr = join("", <FILE>); # read lines and join them to a string
close(FILE);
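The fragment above stops once the file has been read into a string. A minimal sketch of the complete cycle (back up, read, substitute, write back) could look as follows. The demo file, its contents, and the choice of replacing float by double are assumptions made only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical demo file; any text file would do.
my $dir      = tempdir(CLEANUP => 1);
my $filename = "$dir/demo.c";
open(my $out, '>', $filename) or die "$0: couldn't create $filename; $!\n";
print $out "float x;\nfloat y;\n";
close($out);

# Keep the original under a backup name, then read it into one string.
my $copyfilename = "$filename.old~~";
rename($filename, $copyfilename) or die "$0: couldn't rename file; $!\n";
open(my $in, '<', $copyfilename) or die "$0: couldn't open file; $!\n";
my $filestr = join("", <$in>);
close($in);

# Substitute and write the result back under the original name.
$filestr =~ s/float/double/g;
open($out, '>', $filename) or die "$0: couldn't open file; $!\n";
print $out $filestr;
close($out);
```

Lexical file handles and the three-argument open are used here instead of the bareword FILE handle; both styles work, but the lexical form closes automatically when the variable goes out of scope.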
Since the need for such types of file substitutions often arises, Perl offers a
one-line statement for accomplishing the task:
perl -pi.old~~ -e 's/float/double/g;' *.c
# another example:
$strpart = substr($filename, 3, 5);
# result: 34567
Stripping away leading and trailing blanks in a string is easily carried out
by regular expressions:
$line1 =~ s/^\s*//; $line1 =~ s/\s*$//;
Note that the regular expression split on colon is Unix specific. On Windows
we need to insert a semi-colon instead (note that /[:;]/ does not give a
cross-platform solution since colon is used in Windows paths, e.g., C:\). Also
note the need for double quotes in the second if test; writing $dir/$program
without double quotes would be an invalid mixture of variables and text (the
slash), or division of two text variables; what we need is to construct a new
string using variable interpolation.
1.4.11 Subroutines
Functions in Perl are called subroutines. Subroutines take the form
sub name {
# extract local variables from the argument array @_
# body of routine
...
return # some data structure
}
The arguments are not part of the subroutine heading. Instead, they are
available in the array @_. Output variables are transferred to the calling code
by returning an appropriate data structure, e.g., a list of the various output
quantities. The return statement can be omitted.
The my keyword makes variables local to the subroutine (see footnote 9). Unless you specify
a variable with my, it is treated as a global variable whose value is visible
outside the routine as well. Frequently, one maps the @_ array onto suitable
local values using convenient list techniques, e.g.,
my ($a, $b) = @_;
This allows working with scalars, such as $a and $b, instead of the array
entries $_[0] and $_[1]. Alternatively, we can extract $a and $b using the
shift operator:
9 See [?] for a precise explanation of the my keyword.
sub statistics {
# arguments are available in the array @_
my $avg = 0; my $n = 0; # local variables
Call by Reference. Modifying the arguments inside the subroutine, i.e., call
by reference, is enabled by working directly on the @_ array. For example,
swap($v1, $v2); # swap the values of $v1 and $v2
sub swap {
my $tmp = $_[0];
$_[0] = $_[1];
$_[1] = $tmp;
}
That is, the entries in @_ act as references (aliases) to the variables used in
the subroutine call (see footnote 10).
We remark that the swap function is just an example of call by reference; the
elegant Perl way of swapping two variables reads ($v2,$v1)=($v1,$v2).
One can also pass references to variables to subroutines and in this way get
the effect of call by reference. A reference to a variable $a reads \$a. Having
the reference as a variable $a_ref, we can extract its value by ${$a_ref}. We
may then write the swap function as
sub swap {
my ($a_ref, $b_ref) = @_; # extract references
# swap the contents of the underlying variables
my $tmp = ${$a_ref};
${$a_ref} = ${$b_ref};
${$b_ref} = $tmp;
}
swap(\$v1, \$v2);
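The two approaches can be compared side by side in a small sketch. The names swap_alias and swap_ref are chosen here only to keep both variants in one file; they do not appear in the original scripts.

```perl
use strict;
use warnings;

# Variant 1: modify the caller's variables through the aliases in @_.
sub swap_alias {
    my $tmp = $_[0];
    $_[0] = $_[1];
    $_[1] = $tmp;
}

# Variant 2: receive explicit references and dereference them.
sub swap_ref {
    my ($x_ref, $y_ref) = @_;
    (${$x_ref}, ${$y_ref}) = (${$y_ref}, ${$x_ref});
}

my ($v1, $v2) = (1.5, "two");
swap_alias($v1, $v2);      # $v1 is now "two", $v2 is 1.5
swap_ref(\$v1, \$v2);      # back to the original values
print "v1=$v1 v2=$v2\n";
```

The second variant makes the side effect visible at the call site through the backslashes, which many programmers find easier to read.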
10 Perl applies call by reference, and copying the arguments in @_ into local variables
in a my statement simulates call by value.
sub print2file {
my %args = (message => "no message", # default
file => "tmp.tmp", # default
@_); # assign and override
open(FILE,">$args{file}");
print FILE "$args{message}\n\n";
close(FILE);
}
Inside the subroutine we first assign default values to the hash entries and
thereafter we insert the argument list @_, which can be interpreted as a hash
as well. This latter hash might then override our default values. For example,
calling
print2file(file => $filename);
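The override mechanism can be studied in isolation with a small sketch. The helper merged_args is hypothetical; it only merges the defaults with the caller's key-value pairs and returns the result, so that no file is written.

```perl
use strict;
use warnings;

# Hypothetical helper: defaults first, then the caller's pairs from @_.
sub merged_args {
    my %args = (message => "no message",   # default
                file    => "tmp.tmp",      # default
                @_);                       # later pairs override earlier ones
    return %args;
}

my %a1 = merged_args(file => "my.log");
my %a2 = merged_args();
print "file=$a1{file} message=$a1{message}\n";
print "file=$a2{file} message=$a2{message}\n";
```

Since a hash is initialized from a flat list of key-value pairs, a key that occurs twice keeps the value given last, which is why the defaults must come first.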
All the subroutines in the Perl libraries are declared before you use them so
you can omit parentheses if you desire. Here are some examples:
print "No of iterations=$iter\n";
print("No of iterations=$iter\n");
my $index = 0; my $item;
for $item (@list) {
printf("item %d: %-20s description: %s\n",
$index, $item, $help[$index]);
$index++;
}
We refer to the Pass by Reference section of perldoc perlsub (or the equiv-
alent text in [?, p. 116-118]) for more information.
Now, suppose we have an array @xy1 similar to @points. The curves1 array
is supposed to contain a string, @points, another string, and @xy1. Again,
references are required to avoid flattening the structure:
@curves1 = ("u1.dat", \@points, "H1.dat", \@xy1);
yields $a as 0.1.
Nested data structures in Perl must make use of references, and it can
be troublesome to debug such structures. The Data::Dumper module converts
Perl data structures to readable strings: print Dumper(@curves1) results in
the present case in
$VAR1 = 'u1.dat';
$VAR2 = [
          [
            0,
            0
          ],
          [
            0.1,
            1.2
          ],
          [
            0.3,
            0
          ],
          [
            0.5,
            -1.9
          ]
        ];
$VAR3 = 'H1.dat';
$VAR4 = [
          [
            0.3,
            0
          ],
          [
            0.5,
            -1.9
          ]
        ];
The Data::Dumper module supports lots of output formats, see perldoc Data::Dumper.
More information about references can be found in perldoc perlreftut.
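A short sketch may help to see how such a nested structure is built and accessed. The numbers are assumed sample data chosen to resemble the dump shown above.

```perl
use strict;
use warnings;
use Data::Dumper;

# Assumed sample data: arrays of (x,y) pairs.
my @points  = ([0, 0], [0.1, 1.2], [0.3, 0], [0.5, -1.9]);
my @xy1     = ([0.3, 0], [0.5, -1.9]);
my @curves1 = ("u1.dat", \@points, "H1.dat", \@xy1);

# Dereference step by step: element 1 is a reference to @points.
my $points_ref = $curves1[1];
my $second     = $points_ref->[1];      # reference to the pair [0.1, 1.2]
my $x          = $second->[0];          # 0.1
# The arrows between subscripts are optional:
my $y          = $curves1[1][1][1];     # 1.2

$Data::Dumper::Indent = 1;              # a more compact layout
print Dumper(\@curves1);                # pass a reference to get a single $VAR1
```

Passing a reference to Dumper, rather than the array itself, gives one $VAR1 for the whole structure instead of one variable per array element.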
creates three different Perl variables. Every time we use one of the variables,
the prefix immediately shows its type.
However, when working with references the prefix is always a dollar. The
function ref can be used to test what kind of underlying data structure the
reference is pointing to. The return value in a scalar context is a string, like
SCALAR, ARRAY, or HASH. In a boolean context, ref returns true if its
argument is a reference:
The ref function is handy when you work with nested, heterogeneous data
structures. See perldoc -f ref and perldoc perlref for more information.
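As an illustration, the following sketch uses ref to treat the items of a heterogeneous list differently. The helper describe and the sample items are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: describe an argument according to ref().
sub describe {
    my ($item) = @_;
    my $type = ref($item);      # empty string if $item is not a reference
    if (!$type) {
        return "plain scalar";
    } elsif ($type eq 'ARRAY') {
        return "array with " . scalar(@{$item}) . " entries";
    } elsif ($type eq 'HASH') {
        return "hash with keys " . join(",", sort keys %{$item});
    }
    return "reference of type $type";
}

my @report = map { describe($_) } (3.14, [1, 2, 3], {b => 1, a => 2}, \"text");
print "$_\n" for @report;
```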
In the last test, the < operator works on numbers, and $b is interpreted as a
number (<, >, ==, !=, etc. are the comparison operators for numbers, whereas
strings must be compared with lt, gt, eq, ne, etc.).
# alternative:
@filelist = <*.ps *.gif>;
There are also tests for the size and age of a file:
$size = -s $myfile;
$days_since_last_access = -A $myfile;
$days_since_last_modification = -M $myfile;
See perldoc perlfunc and search for -f, -d, and so on for information about
file tests.
The stat function gives more detailed results about a file:
($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
$atime,$mtime,$ctime,$blksize,$blocks) = stat($myfile);
A quote from the description of stat in the man page perlfunc explains what
the various list entries above mean:
 0 dev      device number of filesystem
 1 ino      inode number
 2 mode     file mode (type and permissions)
 3 nlink    number of (hard) links to the file
 4 uid      numeric user ID of file's owner
 5 gid      numeric group ID of file's owner
 6 rdev     the device identifier (special files only)
 7 size     total size of file, in bytes
 8 atime    last access time since the epoch
 9 mtime    last modify time since the epoch
10 ctime    inode change time (NOT creation time!) since the epoch
11 blksize  preferred block size for file system I/O
12 blocks   actual number of blocks allocated
There is an alternative stat function in the File::stat module, see perldoc File::stat.
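With File::stat the entries are accessed by name instead of by list position. A minimal sketch, using a temporary file as an assumed test object:

```perl
use strict;
use warnings;
use File::stat;                  # replaces the built-in stat by an object version
use File::Temp qw(tempfile);

# Create a small temporary file to examine.
my ($fh, $myfile) = tempfile(UNLINK => 1);
binmode($fh);
print $fh "hello\n";
close($fh);

my $info = stat($myfile) or die "Could not stat $myfile: $!\n";
printf("size: %d bytes, last modified: %s\n",
       $info->size, scalar localtime($info->mtime));
```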
Moving files across file systems is reliably done with the move function in
Perl's File::Copy library:
use File::Copy;
move($myfile, "/work/temp") or die "Could not rename file\n";
Copying a file $file to a file $tmpfile is performed with the copy function in
the File::Copy library:
use File::Copy;
copy($file, $tmpfile);
use File::Path;
mkpath("$ENV{HOME}/perl/projects/test1");
Occasionally, one wants to split this filename into the basename hw2a.pl and
the directory name /usr/home/hpl/scripting/perl/intro/:
use File::Basename;
$basename = basename($fname);
$dirname = dirname($fname);
One can also extract the base of the basename, hw2a, either by
$base = $basename;
# or by substituting the file extension by an empty string:
$base =~ s/\.pl$//g;
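The File::Basename module also offers the fileparse function, which splits a path into base, directory, and extension in one call. The following sketch uses the path from the text; the pattern for the extension is one possible choice.

```perl
use strict;
use warnings;
use File::Basename;

my $fname = "/usr/home/hpl/scripting/perl/intro/hw2a.pl";
# The second argument is a pattern describing the suffix to strip off.
my ($base, $dir, $ext) = fileparse($fname, qr/\.[^.]*/);
print "base=$base dir=$dir ext=$ext\n";
```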
sub ourfunc {
# $_ contains the name of the selected file
$file = $_;
# process $file
# $File::Find::dir contains the current directory
# (you are automatically chdir()ed to this directory)
# $File::Find::name contains $File::Find::dir/$file
}
We shall now implement a script that lists all files larger than 1Mb in the
home directory tree. The easiest way to extract the size of a file is to write
$size = -s $file;
sub printsize {
$file = $_; # more descriptive variable name...
if (-f $file) { # is $file a plain file, not a directory?
$size = -s $file; # or $size = (stat($file))[7];
if ($size > 1000000) {
printf("%.1fMb %s in %s\n",$size/1000000.0,$file,
$File::Find::dir);
}
}
}
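A self-contained sketch of the complete procedure is given next. To avoid depending on a particular home directory, the sketch builds a small throw-away tree and uses a limit of 1000 bytes; the file names and sizes are assumptions for the demonstration.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small throw-away directory tree with one small and one big file.
my $root = tempdir(CLEANUP => 1);
mkdir("$root/sub") or die "mkdir failed: $!\n";
my %bytes = ("$root/small.dat" => 10, "$root/sub/big.dat" => 2000);
for my $path (keys %bytes) {
    open(my $fh, '>', $path) or die "Could not open $path: $!\n";
    print $fh "x" x $bytes{$path};
    close($fh);
}

my $limit = 1000;   # bytes; the text uses 1000000 (1 Mb)
my @large;          # full names of the files found
find(sub {
         # $_ is the file name; we are chdir()'ed to $File::Find::dir
         return unless -f $_;
         push(@large, $File::Find::name) if (-s $_) > $limit;
     }, $root);
print "$_\n" for @large;
```

Replacing $root by $ENV{HOME} and $limit by 1000000 gives a script for the task stated in the text.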
and realize that the resulting code has 55 (!) lines and is less cross-platform
than our hand-coded version.
If this one-liner gives an error message, you need to get libwww-perl from
CPAN (see page 54).
The Perl script lwp-download (from the libwww-perl package) fetches a
single file whose URL is known:
lwp-download http://www.ifi.uio.no/~hpl/downloadme.dat
The script looks at the file contents and creates a suitable local filename for
the copy. In this case, downloadme.dat is a text file that lwp-download stores
as downloadme.dat.txt. A second argument to lwp-download can be used to
specify a local filename.
Inside a Perl script we can easily copy a file, given as a URL, to a local
file:
use LWP::Simple;
$URL = "http://www.ifi.uio.no/~hpl/downloadme.dat";
getstore($URL, "downloadme.dat");
# copy only if local file is not up-to-date:
mirror($URL, "downloadme.dat.pl");
The URL in these examples could also have been an ftp address, e.g.,
ftp://ftp.ifi.uio.no/pub/blab/xite/xite3_4.tar.gz
There is also a higher-level module Benchmark, based on the time and times
functions, with various support for timing of Perl scripts. The usage goes
as follows.
use Benchmark;
$t0 = new Benchmark;
# do some tasks...
$t1 = new Benchmark;
$td = timediff($t1, $t0); # time difference between $t0 and $t1
$nice_td_formatting = timestr($td, 'noc');
print "tasks: $nice_td_formatting\n";
The Benchmark module also has a function timeit that runs a piece of Perl
code a specified number of times:
use Benchmark;
print "100 runs took ", timestr(timeit(100,\&somefunc)), "\n";
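A complete, runnable sketch combining the two techniques is shown below. The workload in somefunc is an arbitrary placeholder, and the reported timings will of course vary between machines.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr timediff);

# Hypothetical workload: sum the integers 1..1000.
sub somefunc {
    my $sum = 0;
    $sum += $_ for 1 .. 1000;
    return $sum;
}

# Manual timing with two Benchmark objects:
my $t0 = Benchmark->new;
somefunc() for 1 .. 100;
my $t1 = Benchmark->new;
my $manual = timestr(timediff($t1, $t0));

# The same measurement left to timeit:
my $auto = timestr(timeit(100, \&somefunc));
print "manual loop: $manual\n";
print "timeit:      $auto\n";
```

For workloads this small the measured CPU time is close to the clock resolution, so the number of repetitions should be increased before drawing conclusions.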
Perl executes this script without any error message (see footnote 12). The Fatal module can
be used for letting Perl speak up about run-time errors:
perl -e 'use warnings; use strict; use diagnostics;
use Fatal qw/open/; local *F;
open(F,"<mynonexistingfile"); close(F);'
12 Python provides instructive run-time messages by default in similar examples
(and the messages can be turned off by, e.g., appropriate exception handling in
the script).
Note that you must list the functions you want to be verbose, here open. The
reported error message now contains the helpful message
Can't open(F, <mynonexistingfile): No such file or directory
The use warnings, use strict, and use diagnostics commands can help you
detect statements that are candidates for trouble. However, applying use strict
to (most of) the Perl scripts in this appendix will result in lots of
error messages about lack of the main:: prefix for all global variables or an
explicit my or local operator (to make variables local). For quick scripting
this can be a bit annoying. When writing larger scripts, on the other hand,
use strict is a good habit. Here is a sample code demonstrating some
implications of use strict:
use strict;
# introduce the global variable $counter for the first time:
$counter = 1; # generates error message
$main::counter = 1; # ok, explicit indication of package name
my $counter = 1; # ok, localizing $counter with the my operator
my $counter; $counter = 1; # equiv. with the line above
The reader is encouraged to take a look at the man pages for the Fatal,
strict, and diagnostics modules. For details on warnings, see perldoc warnings
and man perllexwarn.
Inserting print statements on the fly in the code is an efficient and widely
used debugging method among Perl programmers. Alternatively, the -d op-
tion to a Perl script enables you to interactively debug the script through a
command-line debugger,
perl -d -w mybuggyscript.pl
The -w option turns on many useful warnings about, e.g., unused variables.
The most important commands inside the debugger are s for single step, n
for single step without stepping into subroutines, x for pretty-print of data
structures and variables, and b 85 for setting a break point at line 85. More
detailed information is provided by perldoc perldebug.
There is a Perl/Tk GUI for the Perl debugger, available in the module
ptkdb. Invoke the debugger by
There are several Perl debuggers with graphical interfaces, check out the links
in the Perl resources section in doc.html.
Another Perl module is Devel::Trace, which prints each statement prior
to executing it (the same effect as the -x option to Unix shell scripts).
Extracting Multiple Matches. Suppose you have a string with several numbers.
To extract all numbers from this string, without knowing how many
numbers there may be, we can apply the following Perl construct (see footnote 13):
13 This construct is a counterpart to Python's findall function in the re module.
The array @n now contains the entries 3.29, 4.2, and 0.5.
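A minimal sketch of such a construct is a match with the g modifier evaluated in list context. The example string and the simple pattern for real numbers are assumptions made here, chosen so that the result agrees with the entries listed above.

```perl
use strict;
use warnings;

# Example string (an assumption; any text with real numbers will do).
my $s = "3.29 is a number, 4.2 and 0.5 too";
my @n = $s =~ /\d+\.\d+/g;    # list context + g modifier: all matches
print join(", ", @n), "\n";
```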
The forward slashes in the path names must be quoted because the forward
slash is a delimiter for the substitution operator. Fortunately, this Leaning
Toothpick Syndrome [?, p. 70] can be avoided by choosing another delimiter,
e.g.,
$line =~ s#/usr/bin/perl#/usr/local/bin/perl#g;
# or
$line =~ s{/usr/bin/perl}{/usr/local/bin/perl}g;
The pattern is replaced by replacement everywhere in the files *.c and *.h.
A nice feature is that the -i.bak option leads Perl to take a copy, here with
extension .bak, of the original files.
Note the need for double backslashes when the regular expression is stored
as an ordinary Perl string. The complete script for the substitution is found
in src/perl/swap1.pl. Another version, appearing in the file swap2.pl, has
comments in the regular expression:
$arg = "[^,]+";
$call = "superLibFunc # name of function to match
\\s* # possible whitespace
\\( # left parenthesis
\\s* # possible whitespace
($arg) # first argument plus optional whitespace
, # comma between the arguments
\\s* # possible whitespace
($arg) # second argument plus optional whitespace
\\) # closing parenthesis
";
The final x is the pattern-matching modifier for comments and extra whites-
pace in regular expressions. A more typical Perl style is to write the previous
code segment without storing the regular expression in a string variable:
$filestr =~ s{
superLibFunc # name of function to match
\s* # possible whitespace
\( # left parenthesis
\s* # possible whitespace
($arg) # first argument plus optional whitespace
, # comma between the arguments
\s* # possible whitespace
($arg) # second argument plus optional whitespace
\) # closing parenthesis
}{superLibFunc($2, $1)}gx;
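The substitution can be tried out on a small string, as in the following sketch. The two sample calls in $filestr are made up for the demonstration, and the commented pattern is a slightly condensed form of the one above.

```perl
use strict;
use warnings;

my $arg = "[^,]+";
# Made-up sample text with two calls to superLibFunc:
my $filestr = "y = superLibFunc(a, x+1);\nz = superLibFunc(u, v);\n";
$filestr =~ s{
    superLibFunc   # name of function to match
    \s*\(\s*       # left parenthesis with optional whitespace
    ($arg)         # first argument
    ,\s*           # comma and optional whitespace
    ($arg)         # second argument
    \)             # closing parenthesis
}{superLibFunc($2, $1)}gx;
print $filestr;
```

Note that [^,]+ also matches newlines and parentheses, so the pattern relies on backtracking to find the closing parenthesis; arguments that themselves contain commas or parentheses need a more careful pattern.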
sub debugregex {
my ($pattern, $str) = @_;
$s = "does " . $pattern . " match " . $str . "?\n";
if ($str =~ /$pattern/) {
# obtain a list of groups (if present):
@groups = $str =~ m/$pattern/g;
if ($groups[0] == 1) {
# ordinary match, no groups (see perlop man page)
} else {
for $group (@groups) {
$s = $s . "\ngroup: " . $group;
}
}
} else {
$s = $s . "No match";
}
return $s;
}
The Perl version of debugregex is perhaps less useful than the Python version
since regular expressions are seldom stored in strings in Perl. Instead they
appear directly inside /.../ constructions, and to use debugregex, we need to
copy the regular expression, store it in a string, and quote special characters
before performing the call. However, listing debugregex illustrates useful code
segments that can be reused in your own scripts when debugging regular
expressions.
1.4.27 Exercises
Most of the Python exercises from Chapters 2-8 in [?] are well suited for
implementation in Perl. For efficient hands-on training with Perl, we
recommend in particular the following set of exercises: 2.4, 2.6, 2.7, 2.11,
3.2, 3.15, 3.7, 3.16, 8.7, 8.12, 8.18, and ??.
File handles for standard input and standard output, used in Exercise 2.6
in [?], have the names STDIN and STDOUT, respectively, in Perl.
Regarding Exercise 3.15 on page 128 in [?], check out Perl's documentation
of the localtime function for extracting the correct date information (just
write the operating system command perldoc -f localtime).
The relevance of Exercise 8.17 in [?] for Perl programmers is minor, because
Perl has a special variable $/, the input record separator, which you can set
to an empty string "" to make Perl read input in a paragraph-by-paragraph
style. For example,
$/ = "";
@paragraphs = <SOMEFILE1>;
# each entry in @paragraphs is now a paragraph
# alternative:
$/ = "";
while (<>) {
# $_ is now a paragraph
}
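The paragraph mode is easy to test on a string opened as an in-memory file, as in this sketch (the sample text is made up). Using local inside a block restores the original value of $/ afterwards, such that ordinary line-by-line reading elsewhere in the script is not affected.

```perl
use strict;
use warnings;

# Made-up sample text: two paragraphs separated by blank lines.
my $text = "First paragraph,\nsecond line.\n\n\nSecond paragraph.\n";
open(my $fh, '<', \$text) or die "Could not open in-memory file: $!\n";
my @paragraphs;
{
    local $/ = "";            # paragraph mode, restored at end of block
    @paragraphs = <$fh>;
}
close($fh);
printf("%d paragraphs\n", scalar(@paragraphs));
```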
One hint might be to find the source of the man pages and search for special
constructions.
Write a Bourne shell script that first writes an explanation of what the
commands do to the screen and then demonstrates the command through
an example (generate file(s) in the Bourne shell script before running the
commands).
Such a tool is useful when you want to make portable scripts that are always
interpreted by, e.g., your own Perl installation. Search a set of directory trees
and for each executable file that is not a directory, check if the first line
matches text of the form #s*/.*/perl!, and if so, edit the file automatically.
In case you become interested in the magic of Perl headers, you are en-
couraged to run the script src/perl/headerfun.sh.
In this interface, the -s option is used to specify a subject for the email,
-c is used to insert a text in the body of the email, -a is used to assign an
email address, and -x takes a list of extensions of files to be mailed (list items
are separated by commas, and the list is enclosed in parentheses; the quotes
are used to tell the shell that the list is one command-line argument). The
rest of the command-line arguments are the file or directory names to be
mailed. For example, doc/README is typically a file, whereas src, app/test1,
and app/test2 could be directory trees. All files with extensions coinciding
with those given through the -x option will be picked out from these directory
trees and mailed. If the -x option is not specified, all files in the trees are to be sent.
The inner workings of mailfiles.pl will consist of (i) creating a list of all
files to be mailed, then (ii) packing these files into one single file using a tar
facility, (iii) compressing the tarfile, and finally (iv) including the compressed
tarfile as an attachment in a mail. The body of the mail should consist of the
user's comments given through the -c option to mailfiles.pl in addition to
a list of all the files that are included in the tarfile.
Perl has some useful modules for accomplishing the tasks in the mailfiles.pl
script: Archive::Tar creates tarfiles [?, p. 164], Compress::Zlib can be used for
gzip/gunzip file compression, and Mail::Send [?, p. 349] provides functions for
sending emails with attachments. Information on the usage of these modules
is provided by the associated man pages.
Write the mailfiles.pl script and write also the inverse script mail2files.pl,
which unpacks the attached tarfile from an email saved to file.
However, this command does not seem to work. Can you explain what is
going on in detail and create a script that works as intended?
package MyMod;
sub set_logfile {
($logfile) = @_;
print "logfile=$logfile\n";
}
sub another_routine {
print "inside another_routine\n";
for $arg (@_) { print "argument=$arg\n"; }
}
1;
The package statement defines a namespace MyMod, i.e., all variables and func-
tions in a user code must be prefixed by the package name: MyMod::. This
avoids name clashes with other modules that have subroutines or variables
with the same names (e.g., set_logfile or logfile).
A module file must end with a statement that evaluates to true. The
standard choice is 1; and Perl will issue a compilation error if you forget this
crucial statement.
In the user's code the MyMod module is imported by writing use MyMod.
This import statement will be successful only if Perl knows where to find
your module. There are four ways of telling Perl about your directory of
modules:
1. apply the use lib statement, e.g., use lib "/home/hpl/lib";
2. set the PERLLIB or PERL5LIB environment variables,
3. modify Perl's list of search directories for modules, or
4. install the module in the official Perl library directories.
Suppose we have stored the MyMod.pm in the subdirectory src/intrp/perl
under the directory stored in the environment variable scripting. Let us
explain the details of the four techniques for making Perl aware of our MyMod
module. The use lib statement has the form
use lib "/home/hpl/scripting/src/perl";
We can work with paths built from environment variables, to make the script
more portable:
use lib "$ENV{scripting}/src/perl";
In case Perl has trouble finding libraries, it will print the
content of the @INC variable, thus letting you check if all required directories
are present or not.
MyMod::set_logfile("tmp.log");
print "MyMod::logfile=$MyMod::logfile\n";
$p1 = "just some text";
MyMod::another_routine($p1, "mytmp.tmp");
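To experiment with packages without setting up a search path, the module and the user's code can be placed in the same file, as in the following sketch. This single-file layout is an assumption made for the demonstration; in a real project the first part lives in MyMod.pm and ends with 1;.

```perl
use strict;
use warnings;

# Module part (normally the contents of MyMod.pm):
package MyMod;
our $logfile;                 # package variable, i.e., $MyMod::logfile
sub set_logfile {
    ($logfile) = @_;
    return $logfile;
}

# The user's code; main is the default package:
package main;
MyMod::set_logfile("tmp.log");
print "MyMod::logfile=$MyMod::logfile\n";
```

Declaring $logfile with our makes the package variable acceptable under use strict while it remains reachable from outside as $MyMod::logfile.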
The visibility of the module's functions and variables can easily be controlled
(public versus private access). We refer to the description in [?] for details.
The example of a Perl module presented in this section is very simple;
Perl has many additional features to control the behavior of a module. An
overview of making modules is provided by the question How do I create a
module? in the Perl FAQ (see link from doc.html). You can look up [?] for
a more comprehensive description of making modules.
The format specification for the most common formats is the same as in
Python, see perldoc -f pack for a complete list. Writing an array of real
numbers on file in C double format can be performed by the loop
foreach $r (@array) { print FILE pack('d',$r); }
Reading C doubles from file can be done using read to get a specified number
of bytes and then unpack to convert a binary number to a Perl variable:
@array = ();
$d = length(pack('d', 0)); # no of bytes in a C double
# read $d bytes into $data in each pass in the loop:
while (($n = read(FILE, $data, $d)) == $d) {
    push(@array, unpack('d', $data)); # convert to Perl variable
}
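The writing and reading can be combined into a round-trip test, sketched below. An in-memory file (a file handle opened on a scalar) replaces the real file here, which is an assumption made to keep the example self-contained; the sample numbers are arbitrary.

```perl
use strict;
use warnings;

my @array  = (0.5, 1.25, -3);
my $buffer = "";

# Write the numbers as C doubles to an in-memory "file".
open(my $out, '>', \$buffer) or die "Could not open buffer: $!\n";
binmode($out);
for my $r (@array) { print $out pack('d', $r); }
close($out);

# Read them back, one double at a time.
my $d = length(pack('d', 0));      # number of bytes in a C double
my @back;
my $data;
open(my $in, '<', \$buffer) or die "Could not open buffer: $!\n";
binmode($in);
while (read($in, $data, $d) == $d) {
    push(@back, unpack('d', $data));
}
close($in);
print "@back\n";
```

The call to binmode has no effect on Unix, but it prevents newline translation of the binary data on Windows.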
If this script executes successfully, the next step is to compile Perl by typing
make. To test if the building of Perl was successful, type make minitest (can be
omitted). To install Perl, type make install. Object files and other temporary
files from building Perl can be removed by the command make clean.
and then
make
Sometimes you will see that a module depends on other modules, so you
need to install these first.
4. To test the module you can write make test. This step is optional.
5. To install the module such that Perl can find it when you say use SomeMod
in some script, you need to run
make install
This will copy the necessary files to the directories where Perl is installed.
If you have your own installation of Perl, this will work fine, but if Perl
was installed by your system administrator, you probably do not have
write permission in the relevant directories. In that case, you can install a
private copy of the module by specifying a path to the desired installation
directory (say /home/hpl/lib):
perl Makefile.PL LIB=/home/hpl/lib
When you use the module in a Perl script you need to make Perl aware
of where the module is installed, e.g., by writing
use lib "/home/hpl/lib";
prior to use SomeMod (see Chapter 1.4.28).
and answer no to the question Are you ready for manual configuration?.
This will initiate an automatic configuration, which might be sufficient in
many cases, especially if you have your own Perl installation. In case you do
not like the decisions that were made, invoke the CPAN shell again and issue
the command o conf init to revisit the dialog. For example, you may desire
to specify a LIB option when the CPAN module runs perl Makefile.PL for
you (see the description of manual installation of Perl modules).
To install a module, invoke the CPAN shell as described above and just
say install and the module name, e.g.,
install Tar
The latest version of the Tar module will be fetched from a nearby CPAN
site, together with all the modules that Tar depends on and that you do not
already have. The tarfiles are unpacked and the installation procedure is run.
You can also install modules without using the interactive CPAN shell:
We remark that even in this case some of the installation procedures will
prompt the user for extra information, but the default values will often be
sufficient. You can perform a manual installation if you run into problems (by
default, the CPAN shell unpacks the sources in .cpan/build in your home
directory; just go to a module's directory and issue the commands required
for manual installation).
To learn about the modules, look at their man pages (write, for instance,
perldoc Tar).
# for an exercise:
perl -MCPAN -e 'install Tar'
the same type of code in Perl. A Python slogan (see footnote 15) is "there should be
one and preferably only one obvious way to do it", in contrast to Perl's
"There's More Than One Way To Do It" (see Chapter 1.3). The
one-way-to-do-it philosophy in Python eliminates to a large extent the need for
detailed programming standards in a project. Different Perl cultures use
different styles and constructs, a fact that can be confusing when novice
Perl programmers read other people's code.
Python has easy-to-use data structures. Lists and dictionaries (hashes)
can be heterogeneous and arbitrarily nested in Python and Perl, but
in Python there is no need to work with references and the associated
dereferencing syntax as in Perl. That is, the Python syntax is intuitive
and makes it easy to work with complicated, nested data structures.
Moreover, the keys in Python dictionaries can be arbitrary (immutable)
objects, whereas the keys in Perl hashes are limited to plain text.
Python has full and natural support for object orientation. Object-oriented
programming is more awkward in Perl as it was added at a late stage in
the development of the language.
Python's application domain is wide. Python is a combination of a scripting
language for gluing existing applications (for which Unix shells, Tcl,
and Perl are competing tools) and a system programming language with
object-orientation and support for complicated, nested, heterogeneous,
user-defined data structures (an area where C++, Java, and to some
extent Perl are competing tools). Perl also supports complicated data
structures, but with a less attractive syntax; in Python you can work
directly with the objects in the structure, while in Perl you must work
through references, a fact that clutters the code with combinations of
@, $, and backslashes and makes it less readable. (Compare, for example,
the displaylist subroutine on page 31 with the corresponding straightforward
implementation in Python.)
Python supports multi-language programming in an easy way. There are
several tools that make it easy to combine Python with C, C++, or
Fortran libraries, as shown in Chapters 5, 9, and 10 in [?]. Combining Perl
with C++, and in particular Fortran, is less well supported.
Python programs can easily be equipped with a GUI. There are well-
developed interfaces to various GUI tools, e.g., wxWindows, Qt, Gtk, Java
Foundation Classes (JFC), and MFC (Microsoft Foundation Classes),
besides this book's main GUI library: Tk. Although Tk, and other libraries
like Gtk, can be used from Perl as well, we find that Python's simple
15 Type import this in a Python interpreter to see this and more slogans.
Perl is particularly popular for writing CGI scripts, i.e., creating
interactive Web pages. Perl's CGI tools are more comprehensive than those
available for Python.
Perl has (together with Python) more comprehensive interfaces to op-
erating system commands and file operations than most other scripting
languages.
Regular expression syntax is best documented in a Perl context, i.e., you
need to understand Perl code to look up all the good regular expression
documentation. Although Python supports most of Perl's regular expression
syntax, many programmers (even Python fans) prefer Perl for heavy
text processing with regular expressions.
Perl has some nice features that allow small scripts to be expressed as
very compact one-line operating system commands.
Perl's "There's More Than One Way To Do It" principle gives the
programmer greater flexibility (and perhaps more fun and challenges) than
when using Python to solve a problem.
Perl is usually faster than Python, up to about twice as fast, for typical
application areas of scripting in computational science and engineering.
1.6.3 Efficiency
It may be of interest to compare the efficiency of Perl versus Python. We
have made a series of scripts in the directory tree src/efficiency for com-
parisons of Perl, Python, and to some extent Tcl, C, and C++. The relative
efficiency of the datatrans1.* scripts is reported in Chapter 2.2.7
in [?]. In that example we found that Perl ran
almost twice as fast as Python. In the following we present two other appli-
cations concerning reading text files with regular expressions and performing
mathematical operations in for loops. We remark that the relative efficiency
of the two languages reported here depends on the version of the interpreters,
the C compiler used to compile them, and the hardware platform. The best
approach to extrapolating the results to your own programming life is to
repeat the tests on your own machine.
meaning that a three-dimensional array has the value 16.7 for the indices
(2, 10, 0). The purpose is then to read this file, line by line, extract the indices
and the array value, using the regular expression
\[(\d+),(\d+),(\d+)\]=(.*)
and fill the corresponding array item. The Python implementation, in regexread.py,
applies a three-dimensional NumPy array, whereas the Perl implementation,
in regexread.pl, stores the numbers in a one-dimensional array (since
three-dimensional arrays in Perl go via references and hence need some extra loops
initially). The data file is created by the makedata.py script.
As usual in the efficiency tests in [?], we normalize the CPU time by the
CPU time of the fastest implementation. Perl then ran at 1.0 and Python at
3.3.
(A faster solution to this programming problem would be to avoid regular
expressions and instead split the line first with respect to =, strip the first (left-hand
side) string, remove the braces (first and last character), and then split with
respect to comma. This implementation and its relative efficiency to using
regular expression is left as an exercise.)
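One possible way to carry out these steps is sketched below, under the assumption that every line has exactly the form [i,j,k]=value. The function name parse_by_split is hypothetical, and no timing claim is made for this version. Note that Perl's split takes a pattern argument, so single-character patterns are still used, but no capturing expression is needed.

```perl
use strict;
use warnings;

# Parse "[i,j,k]=value" by splitting instead of matching a capturing
# regular expression: split at "=", drop the enclosing brackets,
# then split the remaining text at commas.
sub parse_by_split {
    my ($line) = @_;
    chomp $line;
    my ($lhs, $value) = split /=/, $line, 2;        # "[2,10,0]" and "16.7"
    my $inner = substr($lhs, 1, length($lhs) - 2);  # remove first and last char
    my ($i, $j, $k) = split /,/, $inner;
    return ($i, $j, $k, $value);
}

my ($i, $j, $k, $value) = parse_by_split("[2,10,0]=16.7\n");
print "indices: $i $j $k  value: $value\n";
```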
If we instead load a data file where the array data are written in NumPy
format, such that a plain eval(file.read()) type of statement recreates the
array, the Python script (readeval.py) is very short and ran at 2.8 CPU time
units. These numbers are independent of the size of the array as long as the
array fits into memory.
From these tests on interpreting text files with regular expressions, we may
conclude that Perl is more than three times faster than Python. The reading
in Python is a bit faster if the array is in NumPy format (which is expected
since there is less text to interpret in this format). We remark that large
array structures should preferably be stored in binary format. The difference
between Perl and Python is then much smaller.
Function Evaluations. The next test concerns computing the integral $\int_a^b f(x)\,dx$
by the Trapezoidal rule:
$$I = h\left(\frac{1}{2} f(a) + \sum_{i=1}^{n-1} f(a + ih) + \frac{1}{2} f(b)\right), \quad h = (b-a)/n .$$
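A minimal Perl sketch of the Trapezoidal rule is given below. It is not one of the scripts in src/efficiency; the function name trapezoidal, the argument names, and the test integrand are assumptions for illustration. The weighted sum of function values is scaled by the interval length h = (b-a)/n.

```perl
use strict;
use warnings;

# Trapezoidal rule for the integral of f from $lower (a) to $upper (b)
# using $n intervals. $f is a reference to a subroutine that takes x
# and returns f(x).
sub trapezoidal {
    my ($f, $lower, $upper, $n) = @_;
    my $h = ($upper - $lower)/$n;
    my $sum = 0.5*($f->($lower) + $f->($upper));   # end points, weight 1/2
    for my $i (1 .. $n-1) {
        $sum += $f->($lower + $i*$h);              # interior points, weight 1
    }
    return $h*$sum;
}

# Example integrand x^2 on [0,1] (an arbitrary choice); the exact value is 1/3.
my $integral = trapezoidal(sub { my $x = shift; return $x*$x }, 0, 1, 1000);
printf "approximate integral: %.8f\n", $integral;
```

Passing the integrand as a subroutine reference keeps the rule independent of the function being integrated, at the cost of one subroutine call per evaluation point.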
Documentation. A book [?] is devoted to GUI building with Perl. This book
also contains a complete reference to Perl/Tk widgets. Tk comes, of course,
with man pages that can be accessed with perldoc: perldoc Tk::Scale, for
instance. There is also a Perl/Tk quick reference book [?]. The doc.html file
contains links to the Perl/Tk FAQ as well as to electronic Perl/Tk introductions.
The source of the Perl/Tk distribution contains a demo of Perl/Tk widgets.
Go to the Perl/Tk package's directory, then to its demos subdirectory,
and run perl widgets. A GUI appears with a list of Tk demos. For each
demo you can view the source code in a separate window. Newcomers to
Perl/Tk might be confused by the Perl framework for organizing the de-
mos. However, concentrating on the specific widget commands and knowing
the meaning of the qw operator (see page 22) should be sufficient for taking
advantage of this collection of Perl/Tk examples.
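For readers who have not met qw yet: it turns a whitespace-separated list of words into a list of strings, without quotes and commas. A small sketch (the variable names are chosen for illustration only):

```perl
use strict;
use warnings;

my @quoted = ('top', 'bottom', 'left', 'right');
my @words  = qw(top bottom left right);   # same list, less typing

print "identical lists\n" if "@quoted" eq "@words";
print "number of words: ", scalar(@words), "\n";
```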
extension. This will point out how easy it is to use Tk from Perl once you are
familiar with the Tk widgets, their functions, arguments, and functionality.
The scripts presented in this section are found in the directory src/perl.
use Tk;
# create main window ($main_win) to hold all widgets:
$main_win = MainWindow->new();
$top = $main_win->Frame(); # create frame
$top->pack(-side => 'top'); # pack frame in main window
$r = 1.2; # default
$r_entry = $top->Entry(-width => 6, -relief => 'sunken',
                       -textvariable => \$r);
$r_entry->pack(-side => 'left');
MainLoop();
Let us explain the script line by line. The first three lines are standard and
ensure that we call the first Perl interpreter in our path, see Chapter 1.1.1.
Any Perl script using the Tk extension must begin with
use Tk;
For this statement to work, you need to get Perl/Tk installed. Instructions are provided in Chapter 1.5.
Before we can create and pack widgets with Perl/Tk we need to make a
main window:
$main_win = MainWindow->new();
Having the main window, we normally start with making a frame to hold all
our widgets and subframes:
$top = $main_win->Frame(); # create frame
$top->pack(-side => 'top'); # pack frame in main window
The reader should notice the close similarity with the corresponding code
expressed in Python/Tkinter:
hwtext = Label(top, text="Hello, World! The sine of")
hwtext.pack(side='left')
If desired, we can merge the creation of the Label object and its packing in
one statement:
$top->Label(-text => "Hello, World! The sine of")
    ->pack(-side => 'left');
Now we do not have any variable holding the label anymore, which means that
we cannot update (configure) the label later. Usually, this is a disadvantage.
A text entry tied to the Perl variable $r is created by calling the Entry
method in our top frame:
$r = 1.2; # default
$r_entry = $top->Entry(-width => 6, -relief => 'sunken',
                       -textvariable => \$r);
$r_entry->pack(-side => 'left');
The width of the text entry equals 6 characters, and the entry is displayed
with a sunken relief, giving a 3D effect in the GUI. Technically, we send a
reference to the variable $r, denoted by \$r in Perl, as argument to the entry
widget.
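To illustrate what a reference such as \$r means, independently of Tk, the sketch below lets a subroutine change a variable through a reference to it. The subroutine name set_through_ref is hypothetical; the point is only that whoever holds the reference can read and update the original variable, not a copy of its value.

```perl
use strict;
use warnings;

my $r = 1.2;
my $r_ref = \$r;            # reference to $r, not a copy of its value

# Assign a new value to whatever scalar the reference points to.
sub set_through_ref {
    my ($ref, $new_value) = @_;
    $$ref = $new_value;     # dereference with an extra $
}

set_through_ref($r_ref, 3.4);
print "r is now $r\n";      # $r itself has been changed
```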
The button in our GUI is supposed to compute $s = sin($r) when being
pressed. The corresponding Perl/Tk code is
$compute = $top->Button(-text => " equals ", -command => \&comp_s);
$compute->pack(-side => 'left');
or shorter:
$top->Label(-textvariable=>\$s, -width=>18)->pack(-side=>'left');
Finally, we need to call the event loop when all GUI components are declared
and packed:
MainLoop();
This loop waits for the user's events, such as pressing buttons and writing
text, and performs the corresponding actions as defined by the widgets. If
you forget to call MainLoop, nothing will be shown on the screen and the script
just hangs.
$r_entry->bind('<Return>', \&comp_s);
Hint: The PhotoImage class in the Python script takes the name Photo in
Perl/Tk, and the construction is also slightly different. Another difference
between Python/Tkinter and Perl/Tk is that the from_ argument in Scale
has the straightforward name from in Perl/Tk. Check out the Tk::Image and Tk::Scale
man pages with perldoc.
The number of Perl users increased dramatically in the latter half of the
1990s when Perl's powerful text processing features were recognized to make
interactive Web pages based on the Common Gateway Interface (CGI) much
easier than in traditional languages like Fortran, C, C++, and even Java.
Perl is now a core technology in the Internet programming world, and several
modules for CGI programming are available.
Perl offers more CGI and network programming modules than Python.
On the other hand, the powerful Zope and Plone tools for managing dynamic
Web pages are based on and programmable from Python. The optimal CGI
programming environment is therefore likely to be a Python program calling
up special Perl functionality when desired. This is fortunately a reality as the
company ActiveState has created a tool pyperl for calling Perl from Python.
Some Python tools for CGI programming are explained in Chapter 7 (Web
Interfaces and CGI Programming) in [?]. Here we shall have a look
at similar tools in Perl and see how they apply to the examples from [?].
This implies that the present section is written for readers who have grasped
the basics of CGI programming in Python from Chapter 7 (Web Interfaces and
CGI Programming) in [?].
This file is identical to the hw1-py.html file from page 291 (Web Forms and
CGI Scripts) in [?], used in conjunction with a CGI script in
Python, except for the ACTION parameter, which now specifies a Perl script
hw1.pl.cgi to be called. This CGI script is almost a line-by-line translation of
the similar script in Python. Perl has a module called CGI with easy access to
form variables. As an example, the value of the form parameter r is extracted
by calling the param function in the CGI module:
use CGI;
$form = CGI->new();
$r = $form->param("r");
With this information, and having a look at the Python counterpart hw1.py.cgi,
we can easily create the CGI script in Perl:
#!/usr/local/bin/perl
use CGI;
# required opening of all CGI scripts with output:
print "Content-type: text/html\n\n";
# extract the value of the variable "r" (in the text field):
$form = CGI->new();
$r = $form->param("r"); $s = sin($r);
# print answer (very primitive HTML code):
print "Hello, World! The sine of $r equals $s\n";
The script is found in the file src/perl/hw1.pl.cgi. Observe that the output
is incomplete HTML code, but it will most likely be correctly shown
in a browser.
In an improved version of this CGI script we let the user stay within the
same page, i.e., the Web page acts as a sine calculator. This is accomplished
by letting the script generate the HTML code for the Web form:
#!/usr/local/bin/perl
use CGI;
# required opening of all CGI scripts with output:
print "Content-type: text/html\n\n";
# extract the value of the variable "r" (in the text field):
$form = CGI->new();
if (defined($form->param("r"))) {
    $r = $form->param("r"); $s = sin($r);
} else {
    $r = "1.2"; $s = ""; # default
}
# print form with value:
print <<EOF;
<HTML><BODY BGCOLOR="white">
<FORM ACTION="hw2.pl.cgi" METHOD="POST">
Hello, World! The sine of
<INPUT TYPE="text" NAME="r" SIZE="10" VALUE="$r">
<INPUT TYPE="submit" VALUE="equals" NAME="equalsbutton"> $s
</FORM></BODY></HTML>
EOF
even when no form variables are available (which is the case the first time
the script is launched). However, the sine computation requires that $r is a
valid real number. The script can be found in the file src/perl/hw2.pl.cgi
and is a line-by-line translation of the similar Python script.
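Since the sine computation requires a valid real number, a CGI script could check the form value before calling sin. The sketch below is not part of hw2.pl.cgi; the function name is_real and the accepted number syntax (optional sign, decimal digits, optional exponent) are assumptions for illustration, and the loop over sample strings stands in for values received from a form.

```perl
use strict;
use warnings;

# Return 1 if $text looks like a real number such as 1.2, -3, .5 or 1e-3,
# and 0 otherwise (including undefined or empty input).
sub is_real {
    my ($text) = @_;
    return 0 unless defined $text;
    return $text =~ /^[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?\z/ ? 1 : 0;
}

for my $r ("1.2", "-3", "1e-3", "abc", "") {
    # compute the sine only when the input passed the check
    my $s = is_real($r) ? sin($r) : "(not a number)";
    print "r='$r' s=$s\n";
}
```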
Alternatively, you can notify any Perl interpreter about the location of the
directory where CGI::Debug is installed, e.g.,
#!/usr/local/bin/perl -w
use lib '/some/long/path/to/my/own/lib'; # CGI::Debug is here
use CGI::Debug;
(Footnote 16: A wrapper script, as explained in Chapter 7.1.4 (A General Shell
Script Wrapper for CGI Scripts) in [?], is not necessary unless you
have linked your Perl interpreter with special, local shared libraries.)
Writing use CGI::Debug turns on debugging facilities. This will not affect a
working script. We have included CGI::Debug and introduced two errors in a
version of the hw2.pl.cgi script (the erroneous version is called hw2e.pl.cgi).
After the CGI object form is created we write some statements that trigger
warnings and errors:
print "$undefined_var\n"; # warning
print "$form->param("undefined_key")\n"; # error
# CGI scripts are normally not allowed to write files:
open(FILE, ">myfile") or die "Cannot open myfile!";
Running the script in a browser leads to a compilation error from Perl and
a corresponding report written to the browser by CGI::Debug. The report is
something like
syntax error at /some/path/hw2e.pl.cgi line 13,
near ""$form->param("undefined_key"
Execution of /some/path/hw2e.pl.cgi aborted due to compilation errors.
Parameters
----------
equalsbutton = 6[equals]
r = 3[1.2]
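The syntax error arises because the inner double quotes terminate the string; in addition, method calls are not interpolated inside double-quoted strings. Two working alternatives are sketched below. The tiny FakeForm class is a hypothetical stand-in for the CGI object, used only so that the example runs without a Web server or the CGI module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a CGI object with a param method.
package FakeForm;
sub new   { my ($class, %params) = @_; return bless { %params }, $class; }
sub param { my ($self, $key) = @_; return $self->{$key}; }

package main;

my $form = FakeForm->new(r => 1.2);

# Alternative 1: pass the method call as a separate argument to print.
print "r = ", $form->param("r"), "\n";

# Alternative 2: store the value in a variable and interpolate the variable.
my $r = $form->param("r");
print "r = $r\n";
```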
imports the most standard CGI functions directly into the namespace so we do
not need to work with a CGI object. This enables the following more readable
version of the script:
#!/usr/local/bin/perl
use CGI qw/:standard/;
print header,
      start_html(-title=>"Hello, Web World!",
                 -BGCOLOR=>'white'),
      start_form,
      "Hello, World! The sine of ",
      textfield(-name=>'r', -default=>1.2, -size=>10),
      "\n", submit(-name=>'equals'), " ";
if (param()) { $r = param("r"); $s = sin($r); }
else { $s = sin(1.2); }
print $s, "\n", end_form, end_html, "\n";
show_form(
    -ACCEPT => \&on_valid_form, # must be supplied
    -TITLE => "Hello, Web World!",
    -FIELDS => [
        { -LABEL => 'Hello, World! The sine of ',
          -TYPE => 'textfield', -name => 'r',
          -default => 1.2, },
    ],
    -BUTTONS => [ {-name => 'compute'}, ], # "submit" button(s)
);
sub on_valid_form {
    my $r = param('r');
    my $s = sin($r);
    print header, $s; # write new page with the answer
}
The default value can, of course, also be inserted at construction time of the
entry.
When we need to read the contents of the entry (in the comp_s routine)
we simply use the get command:
set r [ .sine.r get ]
set s [ expr sin($r) ]
where .sine.s is the name of a label widget. This means that the comp_s
function takes the form
proc comp_s { } {
global .sine.r .sine.s; # access global widgets
set r [ .sine.r get ]
set s [ expr sin($r) ]
.sine.s configure -text $s
}
Exercise 2.1. Tcl/Tk version of the GUI in Chapter 6.2 (Adding GUIs to Scripts)
in [?].
Write the simviz1.py script from Chapter 6.2 (Adding GUIs to Scripts)
in [?] in Tcl/Tk. You can use the Scientific Hello World GUIs as starting
point, but you need to consult the Tk man pages to find the proper syntax
for a slider and a picture in Tcl/Tk.
Hint: PhotoImage in the Python script is a counterpart to the image pro-
cedure in Tk, being an example of different names in the Python and Tcl
interfaces to Tk. The construction of the image is also different.