
Scripting with Perl and Tcl

Hans Petter Langtangen


Simula Research Laboratory
and
Department of Informatics
University of Oslo
Table of Contents

1 Introduction to Perl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 A Scientific Hello World Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Reading and Writing Data Files . . . . . . . . . . . . . . . . . . . . . 3
1.1.2 The Complete Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.3 Dissection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.4 The Concept of Context in Perl . . . . . . . . . . . . . . . . . . . . . 7
1.2 Automating Simulation and Visualization . . . . . . . . . . . . . . . . . . . 8
1.2.1 The Complete Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.2 Dissection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 There's More Than One Way To Do It . . . . . . . . . . . . . . . . . . . . . . 12
1.3.1 A Script for Perl Beginners . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3.2 Using the Underscore Variable . . . . . . . . . . . . . . . . . . . . . . . 14
1.3.3 A Script Written in Typical Perl Style . . . . . . . . . . . . . . . . 14
1.3.4 Shorter Scripts for Lazy Programmers . . . . . . . . . . . . . . . . 15
1.3.5 The Ultimate Goal: Getting Rid of the Script File . . . . . 15
1.3.6 Perl Has a Grep Function Too . . . . . . . . . . . . . . . . . . . . . . . 15
1.4 Frequently Encountered Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4.1 Basic Control Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.4.2 File Reading and Writing . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.4.3 Running an Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4.4 One-Line Perl Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.4.5 Array and List Operations . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.4.6 Hash Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.4.7 Splitting and Joining Text . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.4.8 Text Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.4.9 String Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.4.10 Environment Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.4.11 Subroutines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.4.12 Nested, Heterogeneous Data Structures . . . . . . . . . . . . . . . 32
1.4.13 Testing a Variable's Type . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.4.14 Numerical Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.4.15 Listing of Files in a Directory . . . . . . . . . . . . . . . . . . . . . . . 34
1.4.16 Testing File Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.4.17 Copying and Renaming Files . . . . . . . . . . . . . . . . . . . . . . . . 35
1.4.18 Creating and Moving to Directories . . . . . . . . . . . . . . . . . . 36
1.4.19 Removing Files and Directories . . . . . . . . . . . . . . . . . . . . . . 36
1.4.20 Splitting Pathnames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.4.21 Traversing Directory Trees . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.4.22 Downloading Internet Files . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.4.23 CPU-Time Measurements . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
1.4.24 Programming with Classes . . . . . . . . . . . . . . . . . . . . . . . . . . 40
1.4.25 Debugging Perl Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
1.4.26 Regular Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
1.4.27 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1.4.28 Building and Using Modules . . . . . . . . . . . . . . . . . . . . . . . . 49
1.4.29 Binary Input/Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
1.5 Installing Perl and Additional Modules . . . . . . . . . . . . . . . . . . . . . . 52
1.5.1 Installing Basic Perl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
1.5.2 Manual Installation of Perl Modules . . . . . . . . . . . . . . . . . . 52
1.5.3 Automatic Installation of Perl Modules . . . . . . . . . . . . . . . 53
1.5.4 The Required Perl Modules . . . . . . . . . . . . . . . . . . . . . . . . . 54
1.6 Perl Versus Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
1.6.1 Python's Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
1.6.2 Perl's Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
1.6.3 Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
1.7 GUI Programming with Perl/Tk . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
1.7.1 The First Perl/Tk Encounter . . . . . . . . . . . . . . . . . . . . . . . . 60
1.7.2 The Similarity of Python/Tkinter and Perl/Tk . . . . . . . . 62
1.7.3 Binding Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
1.8 Web Interfaces and CGI Programming . . . . . . . . . . . . . . . . . . . . . . 63
1.8.1 Web Versions of the Scientific Hello World Program . . . . 63
1.8.2 Debugging CGI Scripts in Perl with CGI::Debug . . . . . . . 65
1.8.3 Using Perl's CGI Module to Construct Forms . . . . . . . . . 67

2 Introduction to Tcl/Tk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
2.1 A Scientific Hello World Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
2.1.1 The Complete Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
2.1.2 Dissection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
2.2 Reading and Writing Data Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
2.2.1 The Complete Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.2.2 Dissection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.2.3 Double Quotes, Braces, Brackets, and Variable Substitution . . . . . 76
2.3 Automating Simulation and Visualization . . . . . . . . . . . . . . . . . . . 77
2.3.1 The Complete Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
2.3.2 Dissection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.4 Frequently Encountered Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
2.4.1 File Reading and Writing . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.4.2 Running an Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.4.3 List Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.4.4 Associative Array Operations . . . . . . . . . . . . . . . . . . . . . . . 84
2.4.5 Splitting and Joining Text . . . . . . . . . . . . . . . . . . . . . . . . . . 85
2.4.6 Text Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
2.4.7 String Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
2.4.8 Numerical Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
2.4.9 Environment Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
2.4.10 Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
2.4.11 Listing of Files in a Directory . . . . . . . . . . . . . . . . . . . . . . . 91
2.4.12 Testing File Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
2.4.13 Copying and Renaming Files . . . . . . . . . . . . . . . . . . . . . . . . 91
2.4.14 Creating and Moving to Directories . . . . . . . . . . . . . . . . . . 91
2.4.15 Removing Files and Directories . . . . . . . . . . . . . . . . . . . . . . 92
2.4.16 Splitting Pathnames . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
2.4.17 Traversing Directory Trees . . . . . . . . . . . . . . . . . . . . . . . . . . 92
2.4.18 CPU-Time Measurements . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
2.4.19 Programming with Classes . . . . . . . . . . . . . . . . . . . . . . . . . . 93
2.4.20 Building and Using Packages . . . . . . . . . . . . . . . . . . . . . . . . 93
2.4.21 Regular Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
2.4.22 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
2.5 GUI Programming with Tcl/Tk . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
2.5.1 The First Tcl/Tk Encounter . . . . . . . . . . . . . . . . . . . . . . . . 97
2.5.2 Binding Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
2.5.3 Widget Name Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
2.5.4 The Similarity of Python/Tkinter and Tcl/Tk . . . . . . . . . 99
2.5.5 Using Variables in Widget Names . . . . . . . . . . . . . . . . . . . . 100
2.5.6 Configuring Widgets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
2.5.7 The Grid Geometry Manager . . . . . . . . . . . . . . . . . . . . . . . . 102
Preface

The purpose of this document is to show how the introductory programming
examples from the book Python Scripting for Computational Science [?]
can be implemented in Perl and Tcl. In addition, we list some core
functionality of these scripting languages, typically corresponding to the same
information and examples as in Chapter 3 (Basic Python) in [?]. If
you know the examples in a Python context from Chapters 2 (Getting Started
with Python Scripting) and 3 (Basic Python) in [?], it is
quite easy to pick up basic Perl and Tcl from the present note. The Perl and
Tcl chapters can be read independently.
The author would like to include other scripting languages, e.g., Ruby
and Scheme. Potential authors of such (independent) chapters, with the same
structure as the Perl and Tcl chapters, are encouraged to send the author an email
(hpl@simula.no).
The present printing of the document contains the Perl part only.
Chapter 1

Introduction to Perl

This chapter gives a quick introduction to the Perl language for readers
who are familiar (at least to some extent) with the Python scripts from
Chapters 2.1 (A Scientific Hello World Script) through 2.3 (Gluing Stand-Alone
Applications) and 3 (Basic Python) in the book [?]. We
shall look at the same sample scripts and show how the syntax changes when
we program in Perl.

Recommended Documentation. As a companion to the introductory examples
and the overview of basic Perl functionality provided in this appendix, you
need the Perl man pages. These come along with the Perl distribution. I find it
convenient to read the man pages in plain text format using the perldoc tool.
Some common ways of looking up information with perldoc are exemplified
below.
perldoc perl # overview of all Perl man pages
perldoc perlsub # read about subroutines
perldoc Cwd # look up a special module, here Cwd
perldoc -f open # look up a special function, here open
perldoc -q cgi # search the FAQ for the text "cgi"

A Web version of the man pages can be found in the doc.html file. There you
can also find the Perl FAQ and a quick reference.
Having grasped the basic introduction to Perl from this appendix, you
will find the definitive Perl reference, the famous "Camel book" [?], very useful.
However, much of the text in [?] coincides with the Perl man pages. If you feel
that a more comprehensive introduction to Perl is needed, "Learning Perl"
[?] and [?] are recommended. Ready-made recipes for numerous common
tasks in scripting are collected in the highly recommended "Perl Cookbook"
[?]. Advanced features of Perl are well discussed in [?] and [?]. Some Web
resources regarding Perl topics are listed in doc.html.
The first Perl encounter consists of three of the examples from the introduction
to Python in Chapter 2 (Getting Started with Python Scripting)
in [?]. We start out with a Hello World script, before continuing with a script
concerning file handling and array processing. Thereafter we present a script
gluing a simulation and a visualization program. All the scripts referred to
in this section are found in src/perl. Then, in Chapter 1.4, we list, in an
example-oriented way, some basic and useful Perl functionality for quick reference.
Chapter 1.5 explains how to install Perl and additional modules. A brief
comparison of Perl versus Python appears in Chapter 1.6, while Chapters 1.7
and 1.8 deal with graphical user interfaces: standard GUIs and dynamic Web
pages, respectively.

1.1 A Scientific Hello World Script


Our first look at Perl will be the Scientific Hello World script from Chapter 2.1
(A Scientific Hello World Script) in [?]. This script reads a real
number from the command line, takes the sine of the number, and writes
"Hello, World! sin(r)=s" with the appropriate values of the numbers r and s.
In Perl, we can write the script like this:
#!/usr/bin/perl
$r = $ARGV[0]; # fetch the first ([0]) command-line argument
$s = sin($r); # compute sin(r) and store in variable s
print "Hello, World! sin($r)=$s\n"; # print to standard output

Comments in Perl start with # and continue for the rest of the line. However,
the first line, #!/usr/bin/perl, has a special meaning: under Unix it specifies that
the script, if run as an executable file, is to be interpreted by the program
/usr/bin/perl. If the Perl interpreter is stored in another path on
your system, you must write the correct full path in the top line of the script
or (usually better) use a different header to be presented in Chapter 1.1.1.
Scalar variables in Perl are always preceded by a $ sign, i.e., $r and $s are
scalar variables in the present script. The command-line arguments to a Perl
script are automatically stored in the array ARGV. Subscripting this array is
done as in $ARGV[0] (which implies extracting the first entry; arrays in Perl
start with 0 as in C and Python). The length of the array is $#ARGV+1, i.e.,
$ARGV[$#ARGV] is the last entry of the array. The array itself as a variable is
reached with the syntax @ARGV (and one can say, e.g., print "ARGV=@ARGV").
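The indexing rules above can be tried out on any array. The following is a minimal sketch, assuming an ordinary array @args as a stand-in for @ARGV (the name and its contents are made up for illustration) so that the snippet runs without command-line arguments. The my keyword, explained later in this chapter, declares local variables.

```perl
use strict;
use warnings;

# @args is a hypothetical stand-in for @ARGV
my @args = ("0.1", "-v", "out.dat");

my $first  = $args[0];        # first entry; indices start at 0
my $last   = $args[$#args];   # $#args is the last valid index
my $length = $#args + 1;      # number of entries in the array

print "first=$first last=$last length=$length\n";
print "args=@args\n";         # interpolated entries are separated by spaces
```

Note the change of prefix: the array as a whole is written @args, while a single entry is a scalar and is therefore written $args[0].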
Variables can be directly inserted into a text string, a convenient feature
called variable interpolation:
print "Hello, World! sin($r)=$s\n"; # print to screen

Such variable interpolation works only if the string is surrounded by double
quotes. With single quotes, the text is output verbatim, dollar signs included.
Perl's syntax is much inspired by C. For example, the newline character
is \n and all statements are terminated by a semicolon.
As usual in scripting, variables are never declared; the context determines
the type. Contrary to Python, a variable can be used both as a string and
a floating-point number. For example, $r is initialized to a text, but can be
sent to the sine function, which expects a floating-point variable, without any
explicit type conversion.
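A small sketch of this string/number duality, assuming a value "0.5" that starts out as text (the variable names are chosen for illustration only): the operator, not the variable, decides whether the content is treated as a number or as a string.

```perl
use strict;
use warnings;

my $r = "0.5";          # initialized as text
my $s = sin($r);        # used as a number, no explicit conversion
my $twice = $r * 2;     # numeric operator: $r is treated as 0.5
my $glued = $r . "2";   # string concatenation: $r is treated as text

print "sin($r)=$s twice=$twice glued=$glued\n";
```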
Perl's printf function gives good control of the output format of numbers
and strings:
printf "Hello, World! sin(%g)=%12.5e\n", $r, $s;
It is not possible to control the format when using variable interpolation
(i.e., a counterpart to Python's %(s)12.5e is not supported).
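One possible workaround, sketched here with Perl's built-in sprintf function, is to format the number into a string first and then interpolate that string. The variable name $s_formatted is chosen for illustration only.

```perl
use strict;
use warnings;

my $r = 0.1;
my $s = sin($r);

# format the number first, then interpolate the formatted string
my $s_formatted = sprintf("%12.5e", $s);
my $message = "Hello, World! sin($r)=$s_formatted";

print "$message\n";
```

sprintf accepts the same format strings as printf, but returns the result instead of printing it.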
If the script is stored in a file hw.pl, you can execute the script by typing
perl hw.pl 0.1

or you can make the file executable under Unix (chmod a+x hw.pl) and then
just write
./hw.pl 0.1

1.1.1 Reading and Writing Data Files


Chapter 2.2 (Working with Files and Data) in [?] deals with a script
for reading a file with (x, y) data points in two columns and writing a new
two-column file with transformed data points (x, f(y)). On the next pages
we shall present and explain a Perl counterpart to the Python scripts. This
case study demonstrates how to work with files, subroutines, and arrays in
Perl.

1.1.2 The Complete Code


: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;

($infilename, $outfilename) = @ARGV;

open(INFILE, "<$infilename");    # open for reading
open(OUTFILE, ">$outfilename");  # open for writing

# read one line at a time:
while (defined($line=<INFILE>)) {
    ($x, $y) = split(' ', $line);  # extract x and y value
    $fy = myfunc($y);              # transform y value
    printf(OUTFILE "%g %12.5e\n", $x, $fy);
}
close(INFILE); close(OUTFILE);

sub myfunc {
    my ($y) = @_;
    if ($y >= 0.0) { return $y**5.0*exp(-$y); }
    else { return 0.0; }
}

This script is stored in src/perl/datatrans1.pl.



1.1.3 Dissection
The Perl script starts with a header
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

This header ensures that executing the script as


./datatrans1.pl infile outfile

implies interpreting the code by the first perl program encountered in the
directories listed in your PATH environment variable. The explanation of all
the details in our Perl header is intricate, but it can be found in the file
src/perl/headerfun.sh. (This is actually a document written in Bash (!) so
you need to run the file to get the document printed.)
In the case where the user has failed to provide two command-line arguments,
we want to write a usage message and abort the script. This is
accomplished by Perl's die statement: die prints a string on standard error
and terminates the script. In the present example the script dies if there are
fewer than two command-line arguments:
die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;

Recall that $#ARGV is the last legal index in @ARGV, i.e., the length of @ARGV is
$#ARGV+1, so the test is $#ARGV+1 < 2, leading to $#ARGV < 1.
Extracting the first two command-line arguments can be performed by
standard subscripting:
$infilename = $ARGV[0];
$outfilename = $ARGV[1];

However, it is more common (and elegant) to use Perl's list assignment
construction:
($infilename, $outfilename) = @ARGV;

The list on the left-hand side is set equal, entry by entry, to the entries in
the array on the right-hand side. We refer to the remark at the end of this
section for an explanation of the difference between list and array in Perl
terminology.
Opening files in Perl is done with the open function:
open(INFILE, "<$infilename"); # open for reading
open(OUTFILE, ">$outfilename"); # open for writing
The first argument to open is a file handle, which is used for accessing the
file in the Perl code. Input files are recognized by < in front of the name
(see footnote 1), > signifies an output file, and >> implies that text will be
appended to the file.
Reading from a file handle, line by line, is accomplished by
while (defined($line=<INFILE>)) {
# process $line
}

In the present script we want to split the line into an array of words, separated
by whitespace. The split function performs this task:
($x, $y) = split(' ', $line); # extract x and y value
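The separator ' ' (a single space in quotes) is a special case in split: it splits on any run of whitespace and ignores leading whitespace, so a trailing newline does not end up in the last field. The sketch below contrasts this with a regular expression matching exactly one space, using a made-up input line.

```perl
use strict;
use warnings;

my $line = "  0.1   1.4397\n";      # made-up line with irregular spacing

# special separator ' ': leading whitespace and the newline are ignored
my ($x, $y) = split(' ', $line);

# a single-space pattern instead yields empty fields between adjacent spaces
my @pieces = split(/ /, $line);

print "x=$x y=$y\n";
print "number of pieces with / /: ", scalar(@pieces), "\n";
```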

Having the coordinates $x and $y available, we can transform the y value by
calling a function myfunc:
$fy = myfunc($y); # transform y value

One way of printing the transformed coordinate pair to the output file is to
apply the printf function:
printf(OUTFILE "%g %12.5e\n", $x, $fy);

The core of a printf call is the format string, which follows the same syntax as
in C and Python (and all other languages that support C's printf style
for formatting). Perl's ordinary print function can also be used for writing
to files, e.g., print OUTFILE "$x $fy\n";
The myfunc function is defined as
sub myfunc {
    my ($y) = @_;
    if ($y >= 0.0) { return $y**5.0*exp(-$y); }
    else { return 0.0; }
}

Functions are referred to as subroutines in Perl. They typically look like this:
sub name {
    # all subroutine arguments are stored in the array @_
    ...
    return ...
}

The most striking difference from subprograms in other languages is that the
argument list is not a part of the subroutine heading. Instead, all arguments
are available in an array @_. The first step is normally to store the arguments
in local variables:
Footnote 1: If there is no < symbol, the file is opened for reading. In fact,
open(F,"<$name"), open(F,"$name"), and open(F,$name) all lead to opening
a file with name $name for reading.
my ($y) = @_;   # list assignment
# or
my $y = $_[0];  # subscripting

The my keyword declares all variables on the left-hand side as
local variables in the subroutine. This is a good habit, as unintentionally
using global variables inside a subroutine may have undesired effects in other
parts of the script.
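To illustrate argument handling through @_, here is a minimal sketch with two made-up subroutines, scale_and_shift and count_args (both names are hypothetical). It shows list assignment from @_, an optional argument, and the fact that a subroutine can inspect how many arguments it received.

```perl
use strict;
use warnings;

sub scale_and_shift {
    my ($value, $factor, $offset) = @_;   # list assignment from @_
    $offset = 0 unless defined $offset;   # third argument is optional
    return $value * $factor + $offset;
}

sub count_args {
    return scalar(@_);    # @_ in scalar context: number of arguments
}

print scale_and_shift(2, 3), "\n";       # two arguments
print scale_and_shift(2, 3, 1), "\n";    # three arguments
print count_args("a", "b", "c"), "\n";
```

Since the argument list is not declared in the subroutine heading, Perl does not complain about missing arguments; missing entries in the list assignment simply become undefined.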
As in Chapter 2.2 (Working with Files and Data) in [?], we can
modify datatrans1.pl such that (i) the file is loaded into an array of lines,
(ii) the x and y coordinates are stored in two arrays, and (iii) the output file
is written by a for loop over the array entries.
We start by making the open statement a bit more robust. Perl does not
by default write any error message if the file we try to open does not exist.
This can be quite annoying, but the problem is solved by a "try something
or die" construction:
open(INFILE, "<$infilename")
    or die "unsuccessful opening of $infilename; $!\n";

The $! variable is a special variable in Perl containing the last error message
issued by the operating system.
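The behavior of open and $! can be observed without aborting the script. The sketch below assumes a file name that does not exist in the current directory (the name is made up) and stores the error message in a variable instead of calling die.

```perl
use strict;
use warnings;

# hypothetical file name, assumed not to exist in the current directory
my $infilename = "no_such_input_file.dat";

my $message = "";
# open returns a false value on failure; $! then holds the system's message
open(INFILE, "<$infilename")
    or $message = "unsuccessful opening of $infilename; $!";

print "$message\n" if $message;
close(INFILE) unless $message;
```

The exact wording of the text in $! depends on the operating system and its language settings.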
Loading a file into an array of lines is enabled by the syntax
@lines = <INFILE>;

One can then process the array @lines, line by line:
for $line (@lines) {
    # process $line
}

# equivalent syntax:
foreach $line (@lines) {
    # process $line
}

In the present case we want to create two arrays, @x and @y, containing the
x and y coordinates:
@x = (); @y = (); # start with empty arrays
for $line (@lines) {
    ($xval, $yval) = split(' ', $line);
    push(@x, $xval); push(@y, $yval);
}

The x and y coordinates are extracted by splitting the line with respect to
whitespace, exactly as we did in the datatrans1.pl code. The push function
appends new array entries.
Creating the output file can now be performed by a C-like for loop over
the array indices:
open(OUTFILE, ">$outfilename")
    or die "unsuccessful opening of $outfilename; $!\n";

for ($i = 0; $i <= $#x; $i++) {
    $fy = myfunc($y[$i]); # transform y value
    printf(OUTFILE "%g %12.5e\n", $x[$i], $fy);
}
close(OUTFILE);

Recall that $#x is the last valid index in the array @x. The complete code is
found in src/perl/datatrans2.pl.

Remarks on Terminology. Perl distinguishes between the terms array and
list. Roughly speaking, an array is a variable having a list as its value [?,
Ch. 4.0]. For example, in an assignment @a = ("a","b","c"), @a is an array,
whereas its value ("a","b","c") is a list. The function push operates on array
variables and not on lists, meaning that push(@a,"q") works well, while
push(("a","b","c"),"q") does not make sense.

1.1.4 The Concept of Context in Perl


Operations in Perl are evaluated in a specific context. For newcomers to the
language the context concept can be quite confusing. A thorough explanation
of context is provided in the Camel book [?, Ch. 2] or the perldata man
page (invoke perldoc perldata and search for "Context"). Here we shall only
exemplify the two major contexts: scalar and list. The assignment
@a = ("a","b","c");

evaluates the list on the right-hand side in a list context, and @a becomes
an array variable having its entries equal to the three scalars in the list
("a","b","c"). When assigning the list to a scalar,
$a = ("a","b","c");

the list on the right-hand side is evaluated in a scalar context. In this case,
the value of the list is the value of the last element (as with the C comma
operator). Therefore, $a becomes "c". On the other hand,
$b = @a;

evaluates the array variable @a in a scalar context, and its value is then the
length of the array. That is, $b becomes 3.
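A frequent source of confusion is that parentheses on the left-hand side of an assignment change the context. The following sketch, with variable names chosen for illustration, puts the two cases side by side.

```perl
use strict;
use warnings;

my @a = ("a", "b", "c");

my $count   = @a;          # scalar context: the length of the array
my ($first) = @a;          # list context: the first entry of the array
my $n       = scalar(@a);  # scalar() forces scalar context explicitly

print "count=$count first=$first n=$n\n";
```

The parentheses around $first make the left-hand side a list, so the assignment is performed entry by entry rather than yielding the length.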
These examples show that an array variable can have a list as value in
a list context and its length as value in a scalar context. A hash evaluated
in a scalar context becomes true if there are elements in the hash, and false
otherwise. (There is more information in the scalar value; see the perldata
man page.)
The property that an array evaluates to its length in a scalar context is
often taken advantage of by Perl programmers. Two common applications
are

for ($i = 0; $i < @a; $i++) {
    # work with $a[$i] ...
}

die "Usage: $0 file" unless @ARGV;
die "Usage: $0 -f file" unless @ARGV == 2;

Especially the two latter examples have an attractive readability.


The return value of many Perl functions depends on the context. One
example is localtime:
$t = localtime();

yields the date as a string; $t is "Sun May 13 09:02:27 2001", for instance. In
a list context,
@t = localtime();

localtime returns a list of nine values containing the time, day, month, year,
etc. (see perldoc -f localtime), and @t becomes an array of numbers (say)
(27, 2, 9, 13, 4, 101, 0, 132, 1).
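As a sketch of how the list returned by localtime is typically used: the month is counted from 0 and the year from 1900, so both need an adjustment before printing. The date format and the variable names below are illustrative choices.

```perl
use strict;
use warnings;

# list context: nine numbers
my @t = localtime();
my ($sec, $min, $hour, $mday, $mon, $year) = @t;

# months are numbered 0-11 and years are counted from 1900
my $date = sprintf("%04d-%02d-%02d", $year + 1900, $mon + 1, $mday);

# scalar context: a ready-made date string
my $stamp = localtime();

print "list context:   $date\n";
print "scalar context: $stamp\n";
```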

1.2 Automating Simulation and Visualization


Chapter 2.3 (Gluing Stand-Alone Applications) in [?] describes a simple
simulation code, called oscillator, for solving a differential equation
modeling an oscillating system. Using a script, we can improve the user
friendliness of the simulation code and also launch a visualization of the
solution. A Python version of such a script is explained in detail in
Chapter 2.3 (Gluing Stand-Alone Applications) in [?], and the purpose of the
present section is to present the Perl version of that script.

1.2.1 The Complete Code


: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

# default values of input parameters:
$m = 1.0; $b = 0.7; $c = 5.0; $func = "y"; $A = 5.0;
$w = 2*3.14159; $y0 = 0.2; $tstop = 30.0; $dt = 0.05;
$case = "tmp1"; $screenplot = 1;

# read variables from the command line, one by one:
while (@ARGV) {
    $option = shift @ARGV;  # load cmd-line arg into $option
    if ($option eq "-m") {
        $m = shift @ARGV;   # load next command-line arg
    }
    elsif ($option eq "-b") { $b = shift @ARGV; }
    elsif ($option eq "-c") { $c = shift @ARGV; }
    elsif ($option eq "-func") { $func = shift @ARGV; }
    elsif ($option eq "-A") { $A = shift @ARGV; }
    elsif ($option eq "-w") { $w = shift @ARGV; }
    elsif ($option eq "-y0") { $y0 = shift @ARGV; }
    elsif ($option eq "-tstop") { $tstop = shift @ARGV; }
    elsif ($option eq "-dt") { $dt = shift @ARGV; }
    elsif ($option eq "-noscreenplot") { $screenplot = 0; }
    elsif ($option eq "-case") { $case = shift @ARGV; }
    else {
        die "$0: invalid option $option\n";
    }
}

# create a subdirectory with name equal to case and generate
# all files in this subdirectory:
$dir = $case;
use File::Path;   # contains the rmtree function
if (-d $dir) {    # does $dir exist?
    rmtree($dir); # remove directory (old files)
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
chdir($dir) or die "Could not move to $dir; $!\n";

# make input file to the program:


open(F,">$case.i") or die "open error; $!\n";
print F "
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
";
close(F);

# run simulator:
$cmd = "oscillator < $case.i"; # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;

# make gnuplot script:
open(F, ">$case.gnuplot");
print F "
set title '$case: m=$m b=$b c=$c f(y)=$func A=$A w=$w y0=$y0 dt=$dt';
";
if ($screenplot) {
    print F "plot 'sim.dat' title 'y(t)' with lines;\n";
}
print F <<EOF;  # print multiple lines using a "here document"
set size ratio 0.3 1.5, 1.0;
# define the postscript output format:
set term postscript eps monochrome dashed 'Times-Roman' 28;
# output file containing the plot:
set output '$case.ps';
# basic plot command:
plot 'sim.dat' title 'y(t)' with lines;
# make a plot in PNG format as well:
set term png small;
set output '$case.png';
plot 'sim.dat' title 'y(t)' with lines;
EOF
close(F);
# make plot:
$cmd = "gnuplot -geometry 800x200 -persist $case.gnuplot";
$failure = system($cmd);
die "running gnuplot failed\n" if $failure;

The complete source code appears in src/perl/simviz1.pl.

1.2.2 Dissection
The script starts with a safe Perl header, which ensures interpretation of
the script by the first Perl interpreter found in the user's path. After having
assigned default values to the input parameters to the oscillator code, we
encounter an important part of many scripts, namely parsing of command-line
arguments. The idea is that we "eat" the entries in @ARGV one by one
using the shift operator:
$option = shift @ARGV;

This statement implies setting $option equal to the first element in @ARGV
and then removing this element from @ARGV (see footnote 3). We search for
options on the command line until the @ARGV array is empty:
while (@ARGV) {             # while @ARGV is non-empty
    $option = shift @ARGV;  # load command-line arg. into $option
    if ($option eq "-m") {
        $m = shift @ARGV;   # load next command-line arg
    }
    elsif ($option eq "-b") { $b = shift @ARGV; }
    ...
    else {
        die "$0: invalid option $option\n";
    }
}

As an alternative to this explicit grabbing of command-line arguments, we
can use a special Perl utility called GetOptions [?, p. 445]:
use Getopt::Long; # load module with GetOptions function
GetOptions("m=f" => \$m, "b=f" => \$b, "c=f" => \$c,
           "func=s" => \$func, "A=f" => \$A, "w=f" => \$w,
           "y0=f" => \$y0, "tstop=f" => \$tstop,
           "dt=f" => \$dt, "case=s" => \$case,
           "screenplot!" => \$screenplot);
Footnote 3: Experienced Perl programmers will often write just $option = shift;
because shift without arguments implies shifting @ARGV (outside subroutines;
inside a subroutine it shifts @_). More examples regarding such
shortcuts in Perl are provided in Chapter 1.3.
The syntax m=f means searching for the command-line argument --m and
loading the following argument as a floating-point number (=f) into the Perl
variable $m. A single hyphen as in -m works too. Similarly, func=s specifies
--func to take a string argument. The specification of the flag screenplot
allows us to use either --screenplot for setting $screenplot to a true value
or --noscreenplot for setting $screenplot to a false value (note that to get this
on/off behavior, the exclamation mark is required in
"screenplot!" => \$screenplot). The GetOptions function has a rich
functionality; the purpose here is just to notify the reader about the
existence of such a handy function. Instructive
information is obtained from perldoc Getopt::Long. There are several
other modules in the Getopt family. For example: Getopt::Simple for a simplified
interface to Getopt::Long, Getopt::Std for single-character options,
Getopt::Mixed for long and single-character options, and Getopt::Declare for
handling command-line options or configuration files with associated help
text and initialization code.
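The option syntax can be explored without typing command lines by letting Getopt::Long parse an ordinary array. The sketch below uses GetOptionsFromArray, which is available in reasonably recent versions of Getopt::Long and must be imported explicitly; the array @args with its made-up arguments stands in for @ARGV, and only a few of the oscillator options are included.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# default values, as in the simviz1.pl script
my ($m, $func, $case, $screenplot) = (1.0, "y", "tmp1", 1);

# hypothetical command-line arguments, stored in an ordinary array
my @args = ("--m", "2.5", "-func", "siny", "--noscreenplot", "--case", "run1");

my $ok = GetOptionsFromArray(\@args,
    "m=f"         => \$m,           # floating-point value
    "func=s"      => \$func,        # string value
    "case=s"      => \$case,        # string value
    "screenplot!" => \$screenplot,  # on/off flag: --screenplot, --noscreenplot
);

print "ok=$ok m=$m func=$func case=$case screenplot=$screenplot\n";
print "arguments left: ", scalar(@args), "\n";
```

Recognized options are removed from the array, so any entries left afterwards are ordinary (non-option) arguments.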
The next step in our script is to move to the prescribed directory. However,
we should first check whether the directory exists, and if so, we should delete it
and recreate it to avoid mismatch between old and new result files. Checking
if a directory exists is done by the command if (-d $directoryname) in Perl.
Removing a non-empty directory can be conveniently done by first loading an
external Perl module, use File::Path, and then calling the function rmtree
in that module:
use File::Path; # has the rmtree function
if (-d $dir) { # does $dir exist?
rmtree($dir); # remove directory (old files)
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
chdir($dir) or die "Could not move to $dir; $!\n";

Observe that we test for success of mkdir. For example, insufficient permission
to create a new directory will not be noticeable when running the script unless
we include the or die statement (footnote 4).
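The remove-and-recreate step can be collected in a small subroutine. The sketch
below is only an illustration under our own assumptions: the name fresh_dir is
hypothetical, and the demonstration works in a scratch directory made by
File::Temp so that no existing files are touched.

```perl
use strict;
use warnings;
use File::Path qw(rmtree);
use File::Temp qw(tempdir);

# hypothetical helper: remove $dir if it exists, then create it anew
sub fresh_dir {
    my ($dir) = @_;
    rmtree($dir) if -d $dir;
    mkdir($dir, 0755) or die "Could not create $dir; $!\n";
    return $dir;
}

my $parent = tempdir(CLEANUP => 1);   # scratch area, removed at exit
my $dir    = "$parent/tmp1";

fresh_dir($dir);
open(my $old, ">", "$dir/old.dat") or die "Cannot write old.dat; $!\n";
close($old);

fresh_dir($dir);                      # old.dat disappears with the old directory
opendir(my $dh, $dir) or die "Cannot list $dir; $!\n";
my @left = grep { $_ ne "." and $_ ne ".." } readdir($dh);
closedir($dh);
print "files left: ", scalar(@left), "\n";
```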
The next task is to write an input file for the oscillator program. Multi-line
output can easily be created through an ordinary string with embedded
newlines (footnote 5):
print F "
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
";
4
Python will in such cases abort the script and write a Permission denied
message to standard output. See Exercise 1.8.
5
Python requires a triple-quoted string for this purpose.

Alternatively, we can use a special Perl construction (stemming from Unix
shells), known as a here document:
print F <<EOF;
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
EOF

Everything between the two EOF marks is treated as output text. The enclos-
ing EOF must start in the first column of the script file. The Gnuplot script
later in the simviz1.pl code is actually written as a here document.
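Whether variables are interpolated in a here document depends on how the end
mark is quoted. The sketch below, with sample values of our own choosing,
stores two here documents in strings: one with a plain end mark (variables are
interpolated) and one with a single-quoted end mark (the text is taken
literally).

```perl
use strict;
use warnings;

my ($m, $func, $dt) = (1.2, "siny", 0.05);   # sample values

# plain end mark: variables are interpolated, as in a double-quoted string
my $interpolated = <<EOF;
$m
$func
$dt
EOF

# single-quoted end mark: the text is taken literally
my $literal = <<'EOF';
$m
$func
$dt
EOF

print $interpolated, $literal;
```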
Perl's system function is used for running applications:
$cmd = "oscillator < $case.i"; # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;

Visualization of the solution in Gnuplot requires writing a small script with
the proper Gnuplot commands:
open(F, ">$case.gnuplot");
print F <<EOF; # print multiple lines using a "here document"
...
# output file containing the plot:
set output '$case.ps'; # variable interpolation
...
EOF
close(F);

# make plot:
$failure = system("gnuplot $case.gnuplot");
die "running gnuplot failed\n" if $failure;

Never forget to close files before continuing with system commands involving
the generated files!

1.3 There's More Than One Way To Do It


A famous Perl slogan is "There's More Than One Way To Do It" (often
abbreviated TIMTOWTDI, pronounced "Tim Toady"). The goal of the present
section is to exemplify this slogan and demonstrate different Perl programming
styles. We shall develop scripts for finding files containing a specified
string and show that there might be many different Perl solutions to a
programming problem.
When working with computers, you have probably often tried to find a file
containing some particular text without remembering what the filename is.
If you remember parts of the text, the Unix grep command is handy. For example,
grep superLibFunc *

searches all files (*) in the current working directory for the text string
superLibFunc and writes out the matches. This can help you find the file
you are looking for. We shall present a cross-platform Perl script that
implements the grep functionality.

1.3.1 A Script for Perl Beginners


A verbose, easy-to-read grep script in Perl can take the following form.
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

die "Usage: $0 pattern file1 file2 ...\n" if $#ARGV < 1;

# first command-line argument is the pattern to search for:
$pattern = shift @ARGV;
# run through the next command-line arguments, i.e. files, and grep:
while (@ARGV) {
    $file = shift @ARGV;
    if (-f $file) {
        open(FILE, "<$file");
        @lines = <FILE>;  # read all lines
        foreach $line (@lines) {
            if ($line =~ /$pattern/) {
                print "$file: $line";
            }
        }
        close(FILE);
    }
}

The only new statement here is
if ($line =~ /$pattern/)

which tests whether the variable $line matches the regular expression
contained in $pattern. If so, we write out this line.

1.3.2 Using the Underscore Variable


The Perl program can be written more compactly using the implicit $_ variable.
Let us present the code first and then explain what the syntax means.
#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach (@files) {
    if (-f) {
        open(FILE,"<$_");
        foreach (<FILE>) {
            if (/$pattern/) {
                print;
            }
        }
        close(FILE);
    }
}

The extraction of command-line arguments is elegantly performed by dividing
the arguments into the leading search string and an array holding the
filenames:
($pattern, @files) = @ARGV;

Many Perl commands can be issued without an explicit variable to work
with. One example is foreach (@files). In such cases the invisible variable
is $_. That is, foreach (@files) actually means foreach $_ (@files).
The previous code is best explained by showing the equivalent Perl statements
where the $_ appears explicitly:
#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach $_ (@files) {
    if (-f $_) {
        open(FILE,"<$_");
        foreach $_ (<FILE>) {
            if ($_ =~ /$pattern/) {
                print $_;
            }
        }
        close(FILE);
    }
}

1.3.3 A Script Written in Typical Perl Style


The script that makes use of the implicit $_ variable can be rewritten in a
more modern Perl style:

#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach (@files) {
    next unless -f;
    open FILE, $_;
    foreach (<FILE>) { print if /$pattern/; }
    close FILE;
}

The next unless -f statement means that one jumps to the next iteration in
the loop unless the test if (-f $_) is true, i.e., unless the current filename
($_) is an existing file.

1.3.4 Shorter Scripts for Lazy Programmers


There are many shortcuts in Perl aimed at lazy programmers. Here is an
example of a grep script equivalent to those above, but with a much more
compact file reading construction:
#!/usr/bin/perl
$pattern = shift;          # shift; means shift @ARGV
while (<>) {               # read line by line in file by file
    print if /$pattern/o;  # o increases the efficiency
}

The while (<>) loop implies reading all lines in all files whose names are
in @ARGV. (If there are no filenames on the command line, <> reads from
standard input.) Since processing a list of files in a line-oriented fashion is
a frequently encountered task in scripts, while (<>) is a popular and widely
used construction that saves quite some typing. It goes without saying that
each line is available in the $_ variable.

1.3.5 The Ultimate Goal: Getting Rid of the Script File


We can also do the grep operation with a command-line Perl script:
perl -n -e 'print if /superLibFunc/;' file1 file2 file3

Here, the -n option tells Perl to invoke a loop over all lines in all files specified
on the command line (equivalent to while (<>)) and execute the string after
-e as a Perl script applied to each line. Implicit here is that the line is stored
in the $_ variable.

1.3.6 Perl Has a Grep Function Too


The grep operation is so common that Perl in fact has a built-in grep function:

#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach $file (@files) {
    if (-f $file) {
        open FILE, "<$file";
        @lines = <FILE>;
        @match = grep /$pattern/, @lines;  # Perl grep
        print "$file: @match";
        close FILE;
    }
}

The grep function searches for $pattern in a list of all the lines in the file
and returns a list with the lines that contain $pattern. Of course, this
readable script can be condensed to two lines if desired, using the <> notation:
#!/usr/bin/perl
$pattern = shift;
print grep /$pattern/, <>;

Observe that in this version we cannot easily print the filename.

Remark. We should mention that reading the whole file into memory at once,
which is implied by @lines=<FILE> and also by the <> operator in list context,
may cause memory problems if you work with large data files. Line-by-line
reading can then be more appropriate.
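The two reading strategies mentioned in the remark can be compared in a small
sketch. To keep the example self-contained we read from an in-memory string
through a file handle (the sample text and the variable names are our own);
with a real file only the open statement changes.

```perl
use strict;
use warnings;

# sample text standing in for the contents of a (possibly large) file
my $text    = "int a;\nsuperLibFunc(a);\nreturn superLibFunc(0);\n";
my $pattern = "superLibFunc";

# variant 1: load all lines into memory, then filter with grep
open(my $in, "<", \$text) or die "Cannot open in-memory file; $!\n";
my @lines = <$in>;
close($in);
my @match = grep { /$pattern/ } @lines;

# variant 2: read one line at a time; only the matches are stored
my @match2;
open($in, "<", \$text) or die "Cannot open in-memory file; $!\n";
while (defined(my $line = <$in>)) {
    push(@match2, $line) if $line =~ /$pattern/;
}
close($in);

print "variant 1: ", scalar(@match), " matches, ",
      "variant 2: ", scalar(@match2), " matches\n";
```

Both variants find the same lines; the second never holds more than one line
of the file (plus the matches) in memory.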

Exercise 1.1. Modify a very Perl-ish grep script.


Consider a grep script in typical modern Perl style:
#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach (@files) {
    next unless -f;
    open FILE, $_;
    foreach (<FILE>) { print if /$pattern/; }
    close FILE;
}

Extend this script such that the filename and the line number are printed
at the beginning of the lines that match the given string. You can count
the number of lines in the last foreach loop, or you can make use of Perl's
special variable $., which holds the line number of the current line. Write
the line number in a field of width (say) 5 characters such that the output
is nicely aligned in three columns (filename, line number, line), see
Exercise 8.4 on page 349 in [?] for a sample output.
Observe how simple such an extension would have been if we had used
named variables instead of $_; in other words, readability and extendability
are seldom well supported by extensive use of $_.

1.4 Frequently Encountered Tasks

Frequently encountered tasks in Perl scripts have been collected and orga-
nized in the present section, with the aim of providing a kind of example-
oriented quick reference for the reader. The following tasks are covered:
basic control structures,
file reading and writing,
(multi-line) output with format control,
executing other programs,
working with arrays and hashes,
splitting, joining, searching, and replacing text,
writing and calling Perl subroutines,
checking a file's type, size, and age,
listing and removing files,
creating and removing directories,
moving to directories, traversing directory trees,
measuring CPU time,
building simple Perl modules,
working with regular expressions.

1.4.1 Basic Control Statements


A typical if-else test follows this syntax:
if ($answer eq "copy") {
    $copy = 1;
} elsif ($answer == 0) {  # numeric comparison (non-numeric strings count as 0)
    $quit = 1;
} elsif ($answer eq "run" or $answer eq "execute") {
    $run = 1;
} else {
    print "Invalid answer $answer\n";
}

Perl has numerous ways of writing if tests. Some examples are
if ($pen ne "up") { $pen = "up"; }
if (not $pen eq "up") { $pen = "up"; }
if (! ($pen eq "up")) { $pen = "up"; }
$pen = "up" if $pen ne "up";
$pen = "up" if not $pen eq "up";
$pen = "up" if ! ($pen eq "up");
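When negating a string comparison, operator precedence matters: the !
operator binds tighter than eq, whereas not binds looser. The following sketch
(our own illustration, with a sample value for $pen) evaluates the three
spellings side by side.

```perl
use strict;

my $pen = "down";

# "!" binds tighter than "eq": this is (!$pen) eq "up", i.e. "" eq "up"
my $tight = (! $pen eq "up") ? "true" : "false";

# "not" binds looser than "eq": this is not($pen eq "up")
my $loose = (not $pen eq "up") ? "true" : "false";

# with parentheses, "!" gives the same result as "not"
my $paren = (! ($pen eq "up")) ? "true" : "false";

print "tight=$tight loose=$loose paren=$paren\n";
```

Only the last two tests express "the pen is not up"; the first one compares
the negated $pen with "up" and is false for any value of $pen.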

The for or foreach statement visits the entries in an array, entry by entry:
# convert some PostScript files to GIF:
@somelist = ("file1.ps", "file2.ps", "file3.ps");
for $psfile (@somelist) {
    $giffile = $psfile; $giffile =~ s/\.ps$/.gif/;
    system("convert ps:$psfile gif:$giffile");
}

There is both a while loop and a do-while loop in Perl:
$r = 0; $dr = 0.1;
while ($r <= 10) {
    $s = sin($r); print "$s\n";
    $r += $dr;
}

$r = 0; $dr = 0.1;
do {
    $s = sin($r); print "$s\n";
    $r += $dr;
} while ($r <= 10);

The last statement breaks out of a loop:
for $line (@list_of_lines) {
    last if $line =~ /^% set/;
}

The next statement continues with the next iteration in the loop:
# process regular files only:
for $file (@files) {
    next if not -f $file;  # continue with next file
    # process $file:
    ...
}

1.4.2 File Reading and Writing


The following code segments demonstrate opening a file and reading it line
by line or loading it into a list of lines:
$infilename = "myprog.cpp";
open(INFILE, "<$infilename") # open for reading
or die "Cannot read file $infilename; $!\n";
@lines = <INFILE>; # load file into a list of lines

# alternative reading, line by line:


while (defined($line = <INFILE>)) {
# process $line
}

# quicker variant, using $_:


while (<INFILE>) {
# process current line, stored in $_
}
close(INFILE);

The recipe for opening a file for writing a list of lines is given next.
$outfilename = "myprog2.cpp";
open(OUTFILE, ">$outfilename") # open for writing
or die "Cannot write to file $outfilename; $!\n";
$line_no = 0; # count the line number in @lines
foreach $line (@lines) {
$line_no++;
print OUTFILE "$line_no: $line";
}
close(OUTFILE);

We can proceed with appending text to a file, using Perl's features for writing
(large) blocks of text in one output statement, with embedded variables if
desired:
open(OUTFILE, ">>$filename") # open for appending
or die "Cannot append to file $filename; $!\n";
# print multiple lines at once, using a here document:
print OUTFILE <<EOF;
/*
This file, "$outfilename", is a version
of "$infilename" where each line is numbered.
*/
EOF

# equivalent output using a string instead:
print OUTFILE
"/*
This file, \"$outfilename\", is a version
of \"$infilename\" where each line is numbered.
*/
";

close(OUTFILE);

If you need to treat a file handle, such as OUTFILE, like a variable, e.g.,
when sending it to a function, you should use Perl's FileHandle objects, see
perldoc FileHandle.
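Another way of treating a file handle like a variable is to store it in a
lexical (my) variable, which open supports in reasonably recent Perl versions.
The sketch below is our own illustration of this alternative, not the
FileHandle interface itself; the subroutine name write_numbered is
hypothetical, and the output goes to a scratch directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# hypothetical helper that writes numbered lines to any open file handle
sub write_numbered {
    my ($fh, @lines) = @_;
    my $line_no = 0;
    foreach my $line (@lines) {
        $line_no++;
        print $fh "$line_no: $line";
    }
}

my $dir         = tempdir(CLEANUP => 1);   # scratch area, removed at exit
my $outfilename = "$dir/myprog2.cpp";

open(my $out, ">", $outfilename)
    or die "Cannot write to file $outfilename; $!\n";
write_numbered($out, "int main()\n", "{ return 0; }\n");
close($out);

open(my $in, "<", $outfilename)
    or die "Cannot read file $outfilename; $!\n";
my @result = <$in>;
close($in);
print @result;
```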

1.4.3 Running an Application


Any operating system command can be executed by calling the system func-
tion. Here is an example involving running an application myprog:
$cmd = "myprog -c file.1 -p -f -q";
$failure = system("$cmd > res"); # output goes to file res
die "$0: running $cmd failed\n" if $failure;

A different way of testing for failure is


system("$cmd > res") == 0 or die "$0: running $cmd failed\n";

The return value from system is also available in the special Perl variable $?:
system("$cmd > res");
die "$0: running $cmd failed\n" if $?;
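The value returned by system (and stored in $?) is not the exit status itself:
on Unix-like systems the exit status sits in the upper byte, so it is obtained
by shifting $? eight bits to the right. A minimal sketch, where the running
Perl interpreter ($^X) serves as a stand-in application that exits with
status 3:

```perl
use strict;
use warnings;

# run the Perl interpreter itself ($^X) as a stand-in application
# that terminates with exit status 3
my $failure = system($^X, "-e", "exit 3");

my $exit_status = $? >> 8;    # exit status of the application
my $signal      = $? & 127;   # signal number if it was killed, else 0

print "system returned $failure, exit status $exit_status, signal $signal\n";
```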

To redirect the output from the application into a list of lines, one can
use back quotes:
$cmd = "myprog -c file.1 -p -f -q";
@res = `$cmd`;

Alternatively, one can open a pipe to the application and read the output as
if it were a file:
open(APP, "$cmd |");
@res = <APP>;

# alternative line by line reading:


open(APP, "$cmd |");
while (<APP>) {
# process the current line, stored in $_
}
close(APP);

Pipes can also be used for running interactive applications. After having
opened a write pipe to a program, we can issue various commands, which are
executed upon closing the pipe. Here is an example involving the interactive
Gnuplot program:
open (GNUPLOT, "| gnuplot -persist"); # open a pipe to Gnuplot
print GNUPLOT "set xrange [0:10]; set yrange[-2:2]\n";
print GNUPLOT "plot sin(x)\n"; # draw a sine function
print GNUPLOT "quit\n";
close(GNUPLOT); # run Gnuplot with the commands
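A read pipe can also be opened with the command given as a list, which avoids
passing the command line through a shell. The sketch below assumes a Unix-like
system; as a stand-in for a real application it starts the running Perl
interpreter ($^X) with a tiny program that prints three lines.

```perl
use strict;
use warnings;

# $^X (the running Perl interpreter) serves as a stand-in application
my @cmd = ($^X, "-e", 'print "line $_\n" for 1..3');

# "-|" means: run the command and read its output through the handle
open(my $app, "-|", @cmd) or die "Cannot start @cmd; $!\n";
my @res = <$app>;
close($app) or die "application failed; status $?\n";

print @res;
```

Closing a pipe waits for the application to finish, so testing the return
value of close is a way of detecting that the application failed.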

1.4.4 One-Line Perl Scripts


Perl supports some command-line options for wrapping a script with a loop
over all lines in a series of files. This is very convenient for creating one-line
scripts on the fly. For example,
perl -p -i.bak -e '...' file1 file2 file3

runs a loop over all lines in file1, file2, and file3. For each line, the Perl
commands provided inside the quotes (after the -e option) are executed,
and the -p option implies that the line is printed after execution of the
commands. Without the -i option the printing goes to standard output, but
with -i the files are modified in-place, i.e., the original file is replaced by the
new output. With -i.bak the file file1 is first copied to file1.bak before it
is overwritten. The -p and -i.bak options are normally combined into
-pi.bak. Each line in the files is stored in $_. As an illustration we can let the
script specified by the -e option be s/float/double/g; meaning that float is
replaced by double in some files (here file1, file2, and file3):
perl -pi.bak -e 's/float/double/g;' file1 file2 file3

To avoid automatic printing of each line, we can replace the -p option by
-n. Suppose a data file has numbers in a series of columns, separated by
whitespace, and you want to extract the first and the fourth column. The
relevant one-liner is then
perl -ne '@s=split; print "$s[0]\t$s[3]\n"' datafile

Calling split without an argument implies splitting $_ with respect to
whitespace. The equivalent Perl script, stored in a file, in this latter
example can also be made very short:
while (<>) {@s=split; print "$s[0]\t$s[3]\n";}

1.4.5 Array and List Operations


The most common statements for creating and traversing arrays are listed
next. Creating an array with three entries goes like this:
@arglist = ($myarg1, "displacement", "tmp.ps");

We can use an array as an entry too,
@arr = ($var1, $var2);
@arglist = ($myarg1, "displacement", @arr, "tmp.ps");

but @arglist does not have an array as the third element; the entries of @arr
are simply inserted in @arglist, i.e., @arglist now contains
($myarg1, "displacement", $var1, $var2, "tmp.ps");

To force the third entry to be the @arr array, this entry must be a reference
to @arr, obtained by prefixing @arr with a backslash (see page 29):
@arglist = ($myarg1, "displacement", \@arr, "tmp.ps");
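The difference between inserting an array and inserting a reference to it is
easily checked by counting the entries. A minimal sketch with sample data of
our own:

```perl
use strict;
use warnings;

my @arr    = ("u", "v");
my @flat   = ("a", "displacement", @arr, "tmp.ps");    # @arr is flattened
my @nested = ("a", "displacement", \@arr, "tmp.ps");   # reference kept as one entry

print "flat has ", scalar(@flat), " entries\n";
print "nested has ", scalar(@nested), " entries\n";

# access through the reference in entry 2:
my $second  = $nested[2]->[1];    # one element of @arr
my @entries = @{ $nested[2] };    # the whole array
print "second=$second, entries=@entries\n";
```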

New entries can be appended to an array using the push function, e.g.,
push(@arglist, $myvar2);
push(@arglist, @arr2);

Changing entries is enabled by subscripting, e.g.,
$arglist[2] = "displacement";

Traversing lists applies the for or foreach loop,
foreach $entry (@arglist) {
    print "entry is $entry\n";
}
# or
for $entry (@arglist) {
    print "entry is $entry\n";
}

Index-based traversal is also possible:
for ($i = 0; $i <= $#arglist; $i++) {
    print "entry is $arglist[$i]\n";
}
# or
for ($i = 0; $i < @arglist; $i++) {
    print "entry is $arglist[$i]\n";
}

A widely used shortcut for creating a list of strings is the qw operator:


@strlist = qw/item1 item2 item3/;
# equivalent to:
@strlist = ("item1", "item2", "item3");

The qw operator is frequently used in Perl/Tk programming.


Extracting entries from an array is often performed by a list assignment,
e.g.,
($filename, $plottitle, $psfile) = @arglist;

This assignment works regardless of the length of @arglist (footnote 6). If
@arglist has (say) two elements, $psfile becomes an undefined variable. The
final entry on the left-hand side can be an array, e.g.,
($filename, $plottitle, @rest) = @arglist;
# @rest becomes $arglist[2], $arglist[3] and so on

The shift function returns and removes the first array element:
$first_entry = shift @arglist;

The pop function returns and removes the last array element:
$last_entry = pop @arglist;

Without arguments, shift and pop work on @ARGV in the main program and
@_ in subroutines, e.g.,
$file = shift; # same as shift @ARGV;

sub myroutine {
    my $arg1 = shift; # same as shift @_;
    my $arg2 = shift;
    ...
}
6
Similar list assignments in Python require that the lists on each side of the
assignment operator have equal lengths.

Array items can be changed in-place:
# @A is some array of numbers
for ($i=0; $i<=$#A; $i++) {
    if ($A[$i] < 0.0) { $A[$i] = 0.0; }
}
# @A does not contain negative numbers

The following construction also works, i.e., entries in @A are changed
(footnote 7):
for $r (@A) {
    if ($r < 0.0) { $r = 0.0; }
}

Perl arrays allow slicing: @arglist[1..3] returns the second up to and
including the fourth entry, that is, 1..3 denotes the indices 1-3.
Unlike Python, an array assignment like
@a = @b;

creates a new array @a where each element is a copy of the corresponding array
element in @b. To make a refer to the array b, as in the Python assignment a
= b, we need to let a be a reference:

$a = \@b;

See page 29 for more information about references and how to access the
values referred to by $a.
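The difference between copying an array and referring to it, together with
slicing, can be demonstrated in a few lines. The sketch uses sample data and
variable names of our own (@copy, $ref, @part):

```perl
use strict;
use warnings;

my @b = (1, 2, 3, 4, 5);

my @copy = @b;          # new array with copies of the elements
$copy[0] = 10;          # does not affect @b

my $ref = \@b;          # reference to @b, no copying
$ref->[1] = 20;         # changes $b[1]

my @part = @b[1..3];    # slice: entries with indices 1, 2, and 3

print "b = @b\ncopy = @copy\npart = @part\n";
```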
Reversing the order of the entries in an array is performed by the reverse
function:
@reversed_strlist = reverse(@strlist);

Sorting an array is also easy:


@sorted_list = sort(@list); # sort in ascending ASCII order

The sort order can be controlled by a user-defined function, e.g.,
sub numeric_sort {
    if ($a < $b) { return -1; }
    elsif ($a == $b) { return 0; }
    else { return 1; }
}
@sorted_list = sort numeric_sort @list;
7
The similar construction does not work in Python (cf. the example starting on
page 87 in [?]).

The arguments $a and $b in sort criteria routines are automatically initialized
by Perl and used instead of the @_ array for speed. The numeric_sort routine
is often required, but writing a separate subroutine is actually not necessary
because Perl already has a comparison operator <=> that works with numbers:
@sorted_list = sort { $a <=> $b } @list; # numeric sort

The expression $a <=> $b evaluates to -1, 0, or 1, depending on whether $a is
less than, equal to, or greater than $b, respectively. The corresponding
operator for text is cmp. We refer to the description of the sort function in
perldoc perlfunc (or write just perldoc -f sort) for numerous examples on
writing customized sort functions, e.g., case-insensitive text comparison.
The perlfunc man page is very useful; if you wonder about the Perl func-
tion name for doing a specific task, write perldoc perlfunc and search for
keywords in this man page.
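The effect of the various sort criteria is perhaps best seen on a small list.
The following sketch, with sample data of our own, contrasts the default
(ASCII) order with numeric order and shows a case-insensitive text sort based
on the lc function and the cmp operator.

```perl
use strict;
use warnings;

my @list = (10, 9, 100, 1);

my @ascii   = sort @list;                  # compares the numbers as text
my @numeric = sort { $a <=> $b } @list;    # ascending numeric order
my @reverse = sort { $b <=> $a } @list;    # descending numeric order

my @words  = ("beta", "Alpha", "gamma");
my @nocase = sort { lc($a) cmp lc($b) } @words;   # ignore case

print "ascii:   @ascii\n";
print "numeric: @numeric\n";
print "reverse: @reverse\n";
print "nocase:  @nocase\n";
```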

1.4.6 Hash Operations


A hash, also known as associative array in other languages, or dictionary in
Python, is a kind of array where the index, called key, can be an arbitrary
text. For example, all command-line options to a script could be stored in a
hash with the name of the option (without any hyphens) as key:
$cmlargs{"m"} = 1.2; # or $cmlargs{m} = 1.2;
$cmlargs{"tstop"} = 6.0; # or $cmlargs{tstop} = 6.0;

This allows for easy processing of a large number of command-line arguments
and corresponding script variables. Here is a possible code segment:
# init the entire hash with default values:
# (the entire hash is preceded by %)
%cmlargs = (
    tstop => 6.0,
    m     => 1.2
);
while (@ARGV) { # run through all command-line arguments
    $option = shift @ARGV;
    $option = substr($option, 2); # strip off hyphens (--)
    if (exists($cmlargs{$option})) {
        # next command-line argument is the value:
        $value = shift @ARGV;
        $cmlargs{$option} = $value;
    } else {
        die "The option $option is not registered\n";
    }
}
# traverse the hash structure, key by key:
foreach $option (keys %cmlargs)
    { print "cmlargs{$option}=$cmlargs{$option}\n"; }

With this technique you could develop various tools for initializing and
processing command-line options, and each time you need to add a new variable
and a corresponding option to the script, you can simply add one new line
to the initialization of the default values in the hash %cmlargs.
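A few more hash operations are frequently needed: adding and deleting
key-value pairs, testing for the existence of a key, and visiting the keys in
a well-defined order (the keys function returns them in no particular order).
A minimal sketch with sample values of our own:

```perl
use strict;
use warnings;

my %cmlargs = ("tstop" => 6.0, "m" => 1.2);

$cmlargs{"dt"} = 0.05;        # add a new key-value pair
delete $cmlargs{"tstop"};     # remove a pair

my $has_m     = exists $cmlargs{"m"}     ? "yes" : "no";
my $has_tstop = exists $cmlargs{"tstop"} ? "yes" : "no";

# sort the keys to get reproducible output
my @options = sort keys %cmlargs;
foreach my $option (@options) {
    print "cmlargs{$option}=$cmlargs{$option}\n";
}
print "m registered: $has_m, tstop registered: $has_tstop\n";
```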

1.4.7 Splitting and Joining Text


The split function splits a string according to a delimiter string or a regular
expression. A common use of split is to split a text into words:
$files = "case1.ps case2.ps case3.ps";
@filenames = split(" ", $files); # split wrt whitespace

The entries in @filenames become
("case1.ps", "case2.ps", "case3.ps")

The behavior of split(" ", $str) is equivalent to str.split() in Python,
i.e., whitespace surrounding the words is ignored. Any string delimiter can
be used, e.g.,
$files = "case1.ps, case2.ps, case3.ps";
@filenames = split(", ", $files);

results in @filenames as
("case1.ps", "case2.ps", "case3.ps")

The split function can also split with respect to a regular expression, just
as re.split in Python, e.g.,
$files = "case1.ps, case2.ps, case3.ps";
@filenames = split(/,\s*/, $files);

This also results in the correct split of $files, regardless of the number of
blanks after each comma.
(There is a slight difference between Perl and Python when splitting a
string with respect to whitespace using the regular expression \s+. Leading
and trailing blanks result in an empty string as first and last element in
the returned list when using Python, whereas Perl's split function does not
return an array element corresponding to the trailing blanks.)
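The treatment of leading and trailing whitespace by split is easily explored
in a small test. The sketch below uses a sample string of our own; the third
call passes a negative limit as third argument, which makes split keep
trailing empty fields.

```perl
use strict;
use warnings;

my $str = "  case1.ps case2.ps  ";

my @words    = split(" ", $str);         # surrounding whitespace is ignored
my @regex    = split(/\s+/, $str);       # leading empty field kept, trailing dropped
my @trailing = split(/\s+/, $str, -1);   # negative limit keeps trailing empty fields

print scalar(@words), " ", scalar(@regex), " ", scalar(@trailing), "\n";
```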
The join command is the inverse of split:
@filenames = ("case1.ps", "case2.ps", "case3.ps");
$cmd = "print " . join(" ", @filenames);

yields $cmd as the string "print case1.ps case2.ps case3.ps".



1.4.8 Text Processing


A basic issue in text processing is recognizing and replacing parts of a text.
Recognizing text can be done in several ways:
# exact string match:
if ($line eq "double") { ... }  # is $line equal to "double"?

# matching with full regular expressions:
if ($line =~ /double/) { ... }  # does $line contain double?
# (here, double can be replaced by any valid regular expression)

Note that in Perl, the comparison operators for strings and numbers are
different (footnote 8), e.g., eq and ne for strings vs. == and != for numbers;
see also Chapter 1.4.14.
Here is an example regarding substituting float by double everywhere in
a file:
$copyfilename = "$filename.old~~";
rename($filename, $copyfilename); # keep the original as a backup copy
open(FILE, "<$copyfilename") or die "$0: couldn't open file; $!\n";
$filestr = join("", <FILE>); # read lines and join them to a string
close(FILE);

$filestr =~ s/float/double/g; # substitute

open(FILE, ">$filename"); # write to the orig file
print FILE $filestr; # print the whole (modified) file
close(FILE);

Since the need for such types of file substitutions often arises, Perl offers a
one-line statement for accomplishing the task:
perl -pi.old~~ -e 's/float/double/g;' *.c

See page 20 for an explanation of the various parts of this command.

1.4.9 String Operations


Strings in Perl are enclosed in single or double quotes, but the type of quotes
affects the string contents, as illustrated next. Double quotes enable variable
interpolation:
$w = "World";
$s1 = "Hello, $w!"; # becomes "Hello, World!"

Single quotes preserve $, @, and other special Perl characters:
$s2 = 'Hello, $w!'; # becomes "Hello, $w!"

Multi-line strings are also possible:
$s3 = "ordinary strings
can be used for
multi-line
text";
8
Python applies == as well as <, <=, >, >= for all data types.

String concatenation is enabled by the dot operator:
$myfile = $filename . "_tmp" . ".dat";

The $myfile variable becomes case1_tmp.dat if $filename is the string case1.
Substrings can be extracted by the substr function, e.g.,
$teststr = "0123456789";
# extract 5 characters, starting
# from the beginning of the string:
$strpart = substr($teststr, 0, 5);
# result: "01234"

# another example:
$strpart = substr($teststr, 3, 5);
# result: "34567"

# skipping the first two characters:
$strpart = substr($teststr, 2);
# result: "23456789"

# extracting the last three characters:
$strpart = substr($teststr, -3);
# result: "789"

Stripping away leading and trailing blanks in a string is easily carried out
by regular expressions:
$line1 =~ s/^\s*//; $line1 =~ s/\s*$//;
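The stripping of blanks is conveniently wrapped in a subroutine that leaves
the original string untouched. The sketch below is our own illustration (the
name strip is hypothetical); it also collects some substr calls on a sample
string so that the results can be inspected.

```perl
use strict;
use warnings;

# hypothetical helper: return a copy of the string without surrounding whitespace
sub strip {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

my $line1    = "   some text \t\n";
my $stripped = strip($line1);          # $line1 itself is not changed

my $teststr = "0123456789";
my @parts = (substr($teststr, 0, 5),   # first five characters
             substr($teststr, 3, 5),   # five characters from index 3
             substr($teststr, 2),      # everything from index 2
             substr($teststr, -3));    # the last three characters

print "[$stripped]\n@parts\n";
```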

1.4.10 Environment Variables


The environment variables are stored in a Perl hash called %ENV. You can
modify, e.g., $ENV{PATH} in the script and it has effect on all child processes
(started by calls to the system function, for instance). Here is an example of
how we can read the PATH environment variable, split it into its various
directories, and check whether each directory contains the executable file vtk:
$program = "vtk";
$path = $ENV{PATH}; # /usr/bin:/usr/local/bin:/usr/X11/bin etc.
@paths = split(/:/, $path);
foreach $dir (@paths) {
    if (-d $dir) {
        if (-x "$dir/$program") {
            $program_path = $dir;
            last; # jump out of the loop (as break in C and Python)
        }
    }
}
if (defined($program_path)) {
    print "$program found in $program_path\n";
} else { print "$program not found\n"; }

Note that splitting on colon is Unix specific. On Windows we need to split on
semicolon instead (note that /[:;]/ does not give a cross-platform solution
since colon is used in Windows paths, e.g., C:\). Also note the need for
double quotes in the second if test; writing $dir/$program without double
quotes would be an invalid mixture of variables and text (the slash), or
division of two text variables. What we need is to construct a new
string using variable interpolation.
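One possible way of avoiding the hardcoded colon is to ask Perl for the path
separator of the current platform, which is available as $Config{path_sep} in
the standard Config module. The sketch below is our own illustration under
that assumption; the subroutine name find_in_path and the file mytool_demo are
hypothetical, and the demonstration (creating an executable file with chmod)
assumes a Unix-like system.

```perl
use strict;
use warnings;
use Config;                  # $Config{path_sep} is ":" on Unix, ";" on Windows
use File::Spec;
use File::Temp qw(tempdir);

# hypothetical helper: return the first PATH directory containing $program
sub find_in_path {
    my ($program) = @_;
    my $sep  = $Config{path_sep};
    my $path = defined $ENV{PATH} ? $ENV{PATH} : "";
    foreach my $dir (split(/\Q$sep\E/, $path)) {
        next unless length($dir) and -d $dir;
        my $candidate = File::Spec->catfile($dir, $program);
        return $dir if -f $candidate and -x $candidate;
    }
    return;                  # not found
}

# demonstration: put a scratch directory with an executable file first in PATH
my $bindir = tempdir(CLEANUP => 1);
my $tool   = File::Spec->catfile($bindir, "mytool_demo");
open(my $fh, ">", $tool) or die "Cannot write $tool; $!\n";
print $fh "#!/bin/sh\n";
close($fh);
chmod(0755, $tool) or die "Cannot chmod $tool; $!\n";
$ENV{PATH} = join($Config{path_sep}, $bindir,
                  defined $ENV{PATH} ? $ENV{PATH} : "");

my $found = find_in_path("mytool_demo");
print defined($found) ? "found in $found\n" : "not found\n";
```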

1.4.11 Subroutines
Functions in Perl are called subroutines. Subroutines take the form
sub name {
# extract local variables from the argument array @_
# body of routine
...
return # some data structure
}

The arguments are not part of the subroutine heading. Instead, they are
available in the array @_. Output variables are transferred to the calling code
by returning an appropriate data structure, e.g., a list of the various output
quantities. The return statement can be omitted.

A Simple Example of a Subroutine. A subroutine for finding the maximum
value of two numbers can be written straightforwardly as follows:
sub max {
    my ($a, $b) = @_;
    my $max; # = maximum value of $a and $b
    if ($a > $b) { $max = $a; } else { $max = $b; }
    return $max;
}

The my keyword makes variables local to the subroutine (footnote 9). Unless
you specify a variable with my it is treated as a global variable whose value
is visible outside the routine as well. Frequently, one maps the @_ array onto
suitable local values using convenient list techniques, e.g.,
my ($a, $b) = @_;

This allows working with scalars, such as $a and $b, instead of the array
entries $_[0] and $_[1]. Alternatively, we can extract $a and $b using the
shift operator:
my $a = shift; # same as shift @_;
my $b = shift;
9
See [?] for a precise explanation of the my keyword.

Variable Number of Arguments. Here is a subroutine statistics, with a
variable number of arguments, which returns a list containing the average
and the minimum and maximum value of all the arguments:
($avg, $min, $max) = statistics($v1, $v2, $v3, $b); # usage

sub statistics {
    # arguments are available in the array @_
    my $avg = 0; my $n = 0; # local variables
    foreach $term (@_) { $n++; $avg += $term; }
    $avg = $avg / $n;

    my $min = $_[0]; my $max = $_[0];
    shift @_; # swallow first arg., it's already treated
    foreach $term (@_) {
        if ($term < $min) { $min = $term; }
        if ($term > $max) { $max = $term; }
    }
    return ($avg, $min, $max);
}
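The same statistics can be computed more compactly with the sum, min, and max
functions from the standard List::Util module. This is an alternative sketch
of our own, not a replacement for the explicit loops above, which show how @_
is traversed; the sample numbers in the call are made up.

```perl
use strict;
use warnings;
use List::Util qw(sum min max);

sub statistics {
    my @values = @_;      # copy the variable-length argument list
    die "statistics: no arguments given\n" unless @values;
    # an array in numeric context yields its number of elements
    return (sum(@values) / @values, min(@values), max(@values));
}

my ($avg, $min, $max) = statistics(1, 2, 3, 6);
print "avg=$avg min=$min max=$max\n";
```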

Call by Reference. Modifying the arguments inside the subroutine, i.e., call
by reference, is enabled by working directly on the @_ array. For example,
swap($v1, $v2); # swap the values of $v1 and $v2

sub swap {
    my $tmp = $_[0];
    $_[0] = $_[1];
    $_[1] = $tmp;
}

That is, @_ contains references to the variables used in the subroutine call
(footnote 10). We remark that the swap function is just an example on call by
reference; the elegant Perl way of swapping two variables reads
($v2,$v1)=($v1,$v2).
One can also pass references to variables to subroutines and in this way get
the effect of call by reference. A reference to a variable $a reads \$a. Having
the reference as a variable $a_ref, we can extract its value by ${$a_ref}. We
may then write the swap function as
sub swap {
    my ($a_ref, $b_ref) = @_; # extract references
    # swap the contents of the underlying variables
    my $tmp = ${$a_ref};
    ${$a_ref} = ${$b_ref};
    ${$b_ref} = $tmp;
}

swap(\$v1, \$v2);
10
Perl applies call by reference, and copying the arguments in @_ into local
variables in a my statement simulates call by value.

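Alternatively, we can just swap the references themselves, but note that this
only exchanges the local copies $a_ref and $b_ref inside the subroutine; the
variables in the calling code are unaffected:
sub swap2 {
    my ($a_ref, $b_ref) = @_;
    # swap references:
    my $tmp = $a_ref; $a_ref = $b_ref; $b_ref = $tmp;
}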

Another example on using references in Perl appears on page 31.
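Which of the swapping techniques actually change the variables in the calling
code can be verified by a small experiment. In the sketch below the three
subroutine names and the sample values are our own; the first routine works on
@_ directly, the second dereferences the references it receives, and the third
swaps only its local copies of the references.

```perl
use strict;
use warnings;

sub swap_alias {     # works on the caller's variables through @_
    my $tmp = $_[0]; $_[0] = $_[1]; $_[1] = $tmp;
}

sub swap_refs {      # swaps the values the references point to
    my ($x_ref, $y_ref) = @_;
    my $tmp = ${$x_ref}; ${$x_ref} = ${$y_ref}; ${$y_ref} = $tmp;
}

sub swap_local {     # swaps only the local copies of the references
    my ($x_ref, $y_ref) = @_;
    my $tmp = $x_ref; $x_ref = $y_ref; $y_ref = $tmp;
    return;
}

my ($v1, $v2) = (1, 2);
swap_alias($v1, $v2);     my @after_alias = ($v1, $v2);
swap_refs(\$v1, \$v2);    my @after_refs  = ($v1, $v2);
swap_local(\$v1, \$v2);   my @after_local = ($v1, $v2);

print "alias: @after_alias, refs: @after_refs, local: @after_local\n";
```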

Keyword Arguments. By using a hash to hold the arguments passed to a
subroutine, one can obtain a very readable syntax and the possibility for
assigning default values to an arbitrary set of the arguments (footnote 11).
Here is an example, where we call a subroutine with two parameters, message
and file:
$filename = "my.tmp";
print2file(message => "testing hash args", file => $filename);

sub print2file {
    my %args = (message => "no message", # default
                file => "tmp.tmp",       # default
                @_);                     # assign and override
    open(FILE,">$args{file}");
    print FILE "$args{message}\n\n";
    close(FILE);
}

Inside the subroutine we first assign default values to the hash entries and
thereafter we insert the argument list @_, which can be interpreted as a hash
as well. This latter hash might then override our default values. For example,
calling
print2file(file => $filename);

leaves $args{message} as "no message", but $args{file} is overwritten by the
$filename variable inside the print2file subroutine. The use of a hash in
subroutine calls also makes the sequence of arguments irrelevant. The technique
is used throughout Perl's Tk module for creating graphical user interfaces
(see Chapter 1.7).
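A drawback of hash-based arguments is that a misspelled keyword is silently
accepted as a new hash entry. One possible remedy, sketched below under our
own assumptions (the subroutine plot_settings and its keywords title and file
are hypothetical), is to compare the received keys with the keys of the hash
of default values.

```perl
use strict;
use warnings;

# hypothetical subroutine with default values and a check for unknown keywords
sub plot_settings {
    my %defaults = (title => "no title", file => "tmp.ps");
    my %args = (%defaults, @_);    # later pairs override earlier ones
    foreach my $key (keys %args) {
        die "unknown argument: $key\n" unless exists $defaults{$key};
    }
    return %args;
}

my %settings = plot_settings(file => "case1.ps");
print "title=$settings{title}, file=$settings{file}\n";

# a misspelled keyword is caught instead of being silently accepted
my $ok  = eval { plot_settings(fiel => "case1.ps"); 1 };
my $err = $ok ? "" : $@;
print "error: $err" unless $ok;
```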

Omitting Parenthesis in a Call. If a subroutine is declared before you call it,


you can omit the parenthesis in the call statement, e.g.,
sub myproc {
my $file1 = shift; // implicit shift on @_
my $file2 = shift;
...
}
# call myproc without parenthesis:
myproc $myfile, "$yourdir/$yourfile";
11
This is the counterpart to Python's keyword arguments, see page 111
(Keyword Arguments) in [?].

All the subroutines in the Perl libraries are declared before you use them, so
you can omit the parentheses if you desire. Here are some examples:
print "No of iterations=$iter\n";
print("No of iterations=$iter\n");

open TMPFILE, ">$tmpfile";


open(TMPFILE, ">$tmpfile");

system "simulator -q 1.2";


system("simulator -q 1.2");

Multiple Arrays as Arguments. If you want to send several arrays to a
subroutine, you need to explicitly pass references to the arrays. Otherwise, one
cannot detect where one array stops and the next starts in @_. We shall now
show an example where we transfer two arrays to a subroutine and print
them out simultaneously in a nice format:
@curvelist = ('curve1', 'curve2', 'curve3');
@explanations = ('initial shape of u',
                 'initial shape of H',
                 'shape of u at t=2.5');

# send the two arrays to displaylist, using references
# (\@list is a reference to the array @list):
displaylist(list => \@curvelist, help => \@explanations);

The implementation of the displaylist routine, taking two array arguments
transferred by references, is listed next.
sub displaylist {
    my %args = (@_);
    # extract the two lists from the two references:
    my $list_ref = $args{list};   # extract reference
    my @list = @$list_ref;        # extract array from reference
    my $help_ref = $args{help};   # extract reference
    my @help = @$help_ref;        # extract array from reference

    my $index = 0; my $item;
    for $item (@list) {
        printf("item %d: %-20s description: %s\n",
               $index, $item, $help[$index]);
        $index++;
    }

    # Alternative, without lots of local variables:
    $index = 0;
    for $item (@{$args{list}}) {
        printf("item %d: %-20s description: %s\n",
               $index, $item, ${$args{help}}[$index]);
        $index++;
    }
}

The output of displaylist looks like this:
item 0: curve1               description: initial shape of u
item 1: curve2               description: initial shape of H
item 2: curve3               description: shape of u at t=2.5

We refer to the "Pass by Reference" section of perldoc perlsub (or the
equivalent text in [?, p. 116-118]) for more information.
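As a complementary sketch, the two array references can also be passed positionally and indexed through the arrow operator, which avoids copying the arrays inside the subroutine. The name format_pairs and the sample data are assumptions made for this illustration only.

```perl
use strict;
use warnings;

# format_pairs is a hypothetical helper, not part of the original scripts.
# It receives two array references and returns one formatted line per item.
sub format_pairs {
    my ($list_ref, $help_ref) = @_;
    my @lines;
    for my $i (0 .. $#{$list_ref}) {
        # $ref->[$i] indexes the array behind the reference directly
        push @lines, sprintf("item %d: %-10s description: %s",
                             $i, $list_ref->[$i], $help_ref->[$i]);
    }
    return @lines;
}

my @curves = ('curve1', 'curve2');
my @notes  = ('initial shape of u', 'initial shape of H');
print "$_\n" for format_pairs(\@curves, \@notes);
```

Returning the lines instead of printing them makes the routine easier to reuse, e.g., for writing to a file.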

1.4.12 Nested, Heterogeneous Data Structures


The problems with displaylist and the need for references also occur in
nested, heterogeneous data structures. Say we want a list such as the curves1
list on page 88 (Lists and Tuples) in [?]. In Perl we could build
some of its components first, which are straight arrays:
@point1 = (0,0);
@point2 = (0.1,1.2);
@point3 = (0.3,0);
@point4 = (0.5,-1.9);

A list of these points must be a list of references to @point1, @point2, etc.:


@points = (\@point1, \@point2, \@point3, \@point4);

Now, suppose we have an array @xy1 similar to @points. The curves1 array
is supposed to contain a string, @points, another string, and @xy1. Again,
references are required to avoid flattening the structure:
@curves1 = ("u1.dat", \@points, "H1.dat", \@xy1);

It is tedious to write the sublists as separate variables, so we can instead write
@curves1 = ("u1.dat", [[0,0], [0.1,1.2], [0.3,0], [0.5,-1.9]],
            "H1.dat", \@xy1);

That is, a list in square brackets provides a reference to an anonymous array.


Indexing is performed with a syntax similar to Python. For example,
$a = $curves1[1][1][0];

yields $a as 0.1.
Nested data structures in Perl must make use of references, and it can
be troublesome to debug such structures. The Data::Dumper module converts
Perl data structures to readable strings: print Dumper(@curves1) results in
the present case in

$VAR1 = 'u1.dat';
$VAR2 = [
          [
            0,
            0
          ],
          [
            '0.1',
            '1.2'
          ],
          [
            '0.3',
            0
          ],
          [
            '0.5',
            '-1.9'
          ]
        ];
$VAR3 = 'H1.dat';
$VAR4 = [
          [
            '0.3',
            0
          ],
          [
            '0.5',
            '-1.9'
          ]
        ];

The Data::Dumper module supports lots of output formats, see perldoc Data::Dumper.
More information about references can be found in perldoc perlreftut.
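To illustrate how such a nested structure can be traversed, here is a minimal sketch. It assumes the same layout as @curves1 (a file name followed by a reference to a list of points, repeated), with the second point list taken from the Dumper output above; the subroutine count_points is a name invented for this example.

```perl
use strict;
use warnings;

# Same layout as @curves1 in the text: name, list of points, name, list of points.
my @curves1 = ("u1.dat", [[0,0], [0.1,1.2], [0.3,0], [0.5,-1.9]],
               "H1.dat", [[0.3,0], [0.5,-1.9]]);

# count_points (hypothetical) returns the number of points after each file name
sub count_points {
    my @curves = @_;
    my %count;
    for (my $i = 0; $i < @curves; $i += 2) {
        my ($name, $points_ref) = @curves[$i, $i+1];
        $count{$name} = scalar @{$points_ref};   # dereference to get the length
    }
    return %count;
}

my %n = count_points(@curves1);
print "$_: $n{$_} points\n" for sort keys %n;

# The arrow operator reads naturally for a single element:
my $y = $curves1[1]->[3]->[1];    # same as $curves1[1][3][1]
print "last y value of u1.dat: $y\n";
```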

1.4.13 Testing a Variable's Type


An ordinary Perl variable is either a scalar, an array, or a hash. The prefix
determines the type of the variable, so the variable name together with its
prefix shows its type; there is no need to test the variable's type (as in Python).
Writing
$var = 1; # scalar
@var = (1, 2); # array
%var = (key1 => 1, key2 => 'two'); # hash

creates three different Perl variables. Every time we use one of the variables,
the prefix immediately shows its type.
However, when working with references the prefix is always a dollar. The
function ref can be used to test what kind of underlying data structure the
reference is pointing to. The return value in a scalar context is a string, like
SCALAR, ARRAY, or HASH. In a boolean context, ref returns true if its
argument is a reference:

if (ref($r) eq "HASH") {   # test return value
    print "r is a reference to a hash.\n";
}
unless (ref($r)) {         # use in boolean context
    print "r is not a reference at all.\n";
}

The ref function is handy when you work with nested, heterogeneous data
structures. See perldoc -f ref and perldoc perlref for more information.
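A small sketch of how ref can steer the processing of mixed data follows. The subroutine describe is a hypothetical helper written for this illustration; it only distinguishes the cases mentioned above.

```perl
use strict;
use warnings;

# describe is a hypothetical helper written for this sketch.
# It returns a short text telling what kind of data its argument holds.
sub describe {
    my $x = shift;
    my $type = ref($x);               # empty string if $x is not a reference
    return "plain scalar"                       unless $type;
    return "array with " . @{$x} . " items"     if $type eq "ARRAY";
    return "hash with " . keys(%{$x}) . " keys" if $type eq "HASH";
    return "reference of type $type";
}

print describe(3.14), "\n";
print describe([1, 2, 3]), "\n";
print describe({a => 1}), "\n";
print describe(\"text"), "\n";
```

The string concatenation puts @{$x} and keys(...) in scalar context, so they yield the number of elements.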

1.4.14 Numerical Expressions


Perl supports the same numerical expressions as C. Strings are automatically
transformed to numbers when required:
$b = 1.2; # b is a number
$b = "1.2"; # b is a string
$a = 0.5 * $b; # b is converted to a real number before mult.

if ($b < 100) { print "ok\n"; } else { print "error!\n"; }


# prints "ok"

In the last test, the < operator works on numbers, and $b is interpreted as a
number (<, >, ==, !=, etc. are the comparison operators for numbers, whereas
strings must be compared with lt, gt, eq, ne, etc.).
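The difference between the two families of operators is easy to overlook, so a short sketch may help. The values are arbitrary sample data chosen for this illustration.

```perl
use strict;
use warnings;

my ($x, $y) = ("10", "9");

# numeric comparison: 10 < 9 is false
my $num_less = ($x < $y) ? 1 : 0;
# string comparison: "10" sorts before "9" because "1" comes before "9"
my $str_less = ($x lt $y) ? 1 : 0;

print "numeric: $num_less, string: $str_less\n";

# the same distinction applies when sorting
my @values  = ("10", "9", "100");
my @as_num  = sort { $a <=> $b } @values;   # numeric order
my @as_text = sort { $a cmp $b } @values;   # string (alphabetic) order
print "@as_num\n@as_text\n";
```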

1.4.15 Listing of Files in a Directory


The following statements return a list of files (in the current working direc-
tory) having extensions .ps or .gif:
@filelist = glob("*.ps *.gif");

# alternative:
@filelist = <*.ps *.gif>;

A more sophisticated glob function is also available, see perldoc File::Glob.

1.4.16 Testing File Types


Perl supports a range of tests for classifying files:
if (-f $myfile) { print "$myfile is a plain file\n"; }
if (-d $myfile) { print "$myfile is a directory\n"; }
if (-x $myfile) { print "$myfile is executable\n"; }
if (-z $myfile) { print "$myfile is empty(zero size)\n"; }
if (-T $myfile) { print "$myfile is a text file\n"; }
if (-B $myfile) { print "$myfile is a binary file\n"; }

There are also tests for the size and age of a file:

$size = -s $myfile;
$days_since_last_access = -A $myfile;
$days_since_last_modification = -M $myfile;

See perldoc perlfunc and search for -f, -d, and so on for information about
file tests.
The stat function gives more detailed results about a file:
($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
$atime,$mtime,$ctime,$blksize,$blocks) = stat($myfile);

A quote from the description of stat in the man page perlfunc explains what
the various list entries above mean:
 0 dev      device number of filesystem
 1 ino      inode number
 2 mode     file mode (type and permissions)
 3 nlink    number of (hard) links to the file
 4 uid      numeric user ID of file's owner
 5 gid      numeric group ID of file's owner
 6 rdev     the device identifier (special files only)
 7 size     total size of file, in bytes
 8 atime    last access time since the epoch
 9 mtime    last modify time since the epoch
10 ctime    inode change time (NOT creation time!) since the epoch
11 blksize  preferred block size for file system I/O
12 blocks   actual number of blocks allocated

There is an alternative stat function in the File::stat module, see perldoc File::stat.
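A minimal sketch of the File::stat interface is shown below. It creates a scratch file with File::Temp so that the example is self-contained; the file contents are made up for the illustration.

```perl
use strict;
use warnings;
use File::stat;            # replaces stat by a version returning an object
use File::Temp qw(tempfile);

# create a small scratch file so the example is self-contained
my ($fh, $fname) = tempfile(UNLINK => 1);
print $fh "12345";
close($fh);

my $st = stat($fname) or die "cannot stat $fname: $!\n";
my $size  = $st->size;     # same as field 7 in the list above
my $mtime = $st->mtime;    # same as field 9
print "$fname: $size bytes, modified ", scalar localtime($mtime), "\n";
```

Named methods are easier to read than positions in a 13-element list, at the cost of loading one more module.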

1.4.17 Copying and Renaming Files


Renaming a file is simple:
rename($myfile, "tmp.1"); # rename $myfile to tmp.1

Moving files across file systems is reliably done with the move function in
Perl's File::Copy library:
use File::Copy;
move($myfile, "/work/temp") or die "Could not rename file\n";

Copying a file $file to a file $tmpfile is performed with the copy function in
the File::Copy library:
use File::Copy;
copy($file, $tmpfile);

1.4.18 Creating and Moving to Directories


Creating a directory and moving to a directory are tasks performed with the
mkdir and chdir functions, respectively:

use Cwd; $origdir = cwd; # remember where we are


$dir = "../mynewdir";
mkdir($dir, 0755) or die "$0: couldn't create dir; $!\n";
chdir($dir);
...
chdir($origdir); # move back to the original directory
chdir; # move to your home directory ($ENV{HOME})

Suppose you want to create a new directory perl/projects/test1 in your
home directory, but none of perl, projects, and test1 exist. Instead of
using repeated mkdir commands, Perl offers the mkpath command, from the
File::Path module, to create the whole path in one statement:

use File::Path;
mkpath("$ENV{HOME}/perl/projects/test1");

1.4.19 Removing Files and Directories


Single files are removed by the unlink statement, e.g.,
unlink("myfile") or die "Could not remove file\n";

A list of files can also be transferred to unlink:
unlink(@files); unlink(glob("*.ps *.gif"));

unlink "myfile", "yourfile", @thosefiles, "$file.tmp" or
    die "Could not remove files\n";

Frequently, one wants to remove a directory tree, possibly full of files, an
action that requires the rmtree function from the File::Path library:
use File::Path;
rmtree("mydir");
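Newer versions of File::Path (2.x, an assumption about your installation) also provide make_path and remove_tree as counterparts to mkpath and rmtree. The sketch below works inside a scratch directory from File::Temp rather than in the home directory.

```perl
use strict;
use warnings;
use File::Path qw(make_path remove_tree);
use File::Temp qw(tempdir);

# work below a scratch directory instead of $ENV{HOME}
my $root = tempdir(CLEANUP => 1);
my $dir  = "$root/perl/projects/test1";

# make_path returns the list of directories it actually created
my @created = make_path($dir);
print "created: $_\n" for @created;

my $exists_after_create = (-d $dir) ? 1 : 0;

remove_tree("$root/perl");          # remove the whole tree again
my $exists_after_remove = (-d "$root/perl") ? 1 : 0;
```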

1.4.20 Splitting Pathnames


Let $fname be a filename containing a possibly long path, e.g.,
$fname = "/usr/home/hpl/scripting/perl/intro/hw2a.pl";

Occasionally, one wants to split this filename into the basename hw2a.pl and
the directory name /usr/home/hpl/scripting/perl/intro:

use File::Basename;
$basename = basename($fname);
$dirname = dirname($fname);

One can also extract the base of the basename, hw2a, either by
$base = $basename;
# substitute the file extension by an empty string:
$base =~ s/\.pl$//g;

or by the fileparse function:
($base, $dirname, $extension) = fileparse($fname, ".pl");

The fileparse function can take an arbitrary number of possible extensions.
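The following sketch shows fileparse with several possible extensions. Since the suffix arguments are treated as regular expressions, the dots are escaped here; the second file name is an invented example.

```perl
use strict;
use warnings;
use File::Basename;

# the suffix arguments are regular expressions, so the dot is escaped
my @suffixes = ('\.pl', '\.py', '\.tar\.gz');

my ($base, $dir, $ext) =
    fileparse("/usr/home/hpl/scripting/perl/intro/hw2a.pl", @suffixes);
print "base=$base dir=$dir ext=$ext\n";

my ($base2, $dir2, $ext2) = fileparse("src/data.tar.gz", @suffixes);
print "base=$base2 dir=$dir2 ext=$ext2\n";
```

Note that the directory part returned by fileparse keeps its trailing slash, in contrast to dirname.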

1.4.21 Traversing Directory Trees


The very useful Unix find command can be implemented in a cross-platform
fashion in Perl using the File::Find library and its find function. The basic
recipe for using Perl's find goes as follows.
use File::Find;
# run through directory trees dir1, dir2, and dir3, and
# for each file call the user-provided subroutine ourfunc:
find(\&ourfunc, "dir1", "dir2", "dir3");

sub ourfunc {
# $_ contains the name of the selected file
$file = $_;
# process $file
# $File::Find::dir contains the current directory
# (you are automatically chdir()ed to this directory)
# $File::Find::name contains $File::Find::dir/$file
}

We shall now implement a script that lists all files larger than 1Mb in the
home directory tree. The easiest way to extract the size of a file is to write
$size = -s $file;

Our script in Perl might then look like


#!/usr/bin/perl
use File::Find;

find(\&printsize, $ENV{HOME}); # traverse home-directory tree

sub printsize {
$file = $_; # more descriptive variable name...
if (-f $file) { # is $file a plain file, not a directory?
$size = -s $file; # or $size = (stat($file))[7];
if ($size > 1000000) {
printf("%.1fMb %s in %s\n",$size/1000000.0,$file,
$File::Find::dir);
}
}
}

We recommend reading perldoc File::Find to see the many possibilities that
Perl's find function offers.
There is a program find2perl that translates a Unix find command into
the equivalent Perl program. The resulting program is not always easy to
read for newcomers to Perl so writing the Perl script yourself gives better
control of what you want to do. In the present example you can try
find2perl find $HOME -name '*' -type f -size +2000 -exec ls -s {} \;

and realize that the resulting code has 55 (!) lines and is less cross-platform
than our hand-coded version.
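Often one wants to collect the files found by find in an array instead of printing them. A possible sketch uses an anonymous subroutine that appends to a local array. The helper find_by_ext and the small directory tree built for the demonstration are assumptions of this example.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# find_by_ext is a hypothetical helper: it returns the full names of all
# plain files below $root whose names end with the extension $ext
sub find_by_ext {
    my ($root, $ext) = @_;
    my @hits;
    # the anonymous subroutine can see @hits and $ext (a closure)
    find(sub {
             push @hits, $File::Find::name if -f $_ && /\Q$ext\E$/;
         }, $root);
    return sort @hits;
}

# build a tiny directory tree to search in
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "mkdir failed: $!\n";
for my $name ("a.ps", "b.txt", "sub/c.ps") {
    open(my $fh, ">", "$root/$name") or die "cannot create $name: $!\n";
    close($fh);
}

my @psfiles = find_by_ext($root, ".ps");
print "$_\n" for @psfiles;
```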

1.4.22 Downloading Internet Files


The libwww-perl package contains numerous modules and scripts for working
with the World Wide Web. You can easily test if libwww-perl is already
installed on your system by trying
perl -e 'use LWP::Simple'

If this one-liner gives an error message, you need to get libwww-perl from
CPAN (see page 54).
The Perl script lwp-download (from the libwww-perl package) fetches a
single file whose URL is known:
lwp-download http://www.ifi.uio.no/~hpl/downloadme.dat

The script looks at the file contents and creates a suitable local filename for
the copy. In this case, downloadme.dat is a text file that lwp-download stores
as downloadme.dat.txt. A second argument to lwp-download can be used to
specify a local filename.
Inside a Perl script we can easily copy a file, given as a URL, to a local
file:
use LWP::Simple;
$URL = "http://www.ifi.uio.no/~hpl/downloadme.dat";
getstore($URL, "downloadme.dat");
# copy only if local file is not up-to-date:
mirror($URL, "downloadme.dat.pl");

or we can load the remote file into a string and split it into an array of lines:
@lines = split(/^/, get($URL));

The URL in these examples could also have been an ftp address, e.g.,
ftp://ftp.ifi.uio.no/pub/blab/xite/xite3_4.tar.gz

Check out perldoc LWP::Simple for details regarding more functionality.



1.4.23 CPU-Time Measurements


Measurement of elapsed time in Perl can be done with the time function:
$t0 = time; # elapsed time in seconds since the epoch
# do tasks...
$elapsed_time = time - $t0;

Because time is measured in seconds, you need to perform efficiency tests
that last several seconds. Timing with finer resolution is possible, see the
Perl FAQ: perldoc -q 'time under a second'.
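One way to obtain finer resolution is the Time::HiRes module, sketched below under the assumption that the module is available in your Perl installation. The call to sleep merely stands in for the tasks to be timed.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);   # overrides time and sleep in this file

my $t0 = time;                    # now a floating-point number of seconds
sleep(0.2);                       # stand-in for the tasks to be timed
my $elapsed = time - $t0;
printf("elapsed time: %.3f s\n", $elapsed);
```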
Throughout this section we assume that the reader is familiar with terms
like epoch, elapsed time, system time, CPU time, and the difference between
children and parent processes, as briefly explained in Chapter 8.10.1
(CPU-Time Measurements) in [?].
A more sophisticated function times returns an array with four entries.
The first two represent the user and system times of the current process, while
the next two contain the user and system times of the current process's child
processes.
@t0 = times;
# do tasks...
system "$time_consuming_command"; # child process
@t1 = times;
$user_time = $t1[0] - $t0[0];
$system_time = $t1[1] - $t0[1];
$cpu_time = $user_time + $system_time;
$cpu_time_system_call = $t1[2] - $t0[2] + $t1[3] - $t0[3];

There is also a higher-level module Benchmark, based on the time and times
functions, with various support for timing of Perl scripts. The usage goes
as follows.
use Benchmark;
$t0 = new Benchmark;
# do some tasks...
$t1 = new Benchmark;
$td = timediff($t1, $t0); # time difference between $t0 and $t1
$nice_td_formatting = timestr($td, 'noc');
print "tasks: $nice_td_formatting\n";

The output looks like this:


tasks: 9 wallclock secs( 3.12 usr + 0.10 sys = 3.22 CPU)

The Benchmark module has also a function timeit that runs a piece of Perl
code a specified number of times:
use Benchmark;
print "100 runs took", timestr(timeit(100,\&somefunc)), "\n";

We refer to perldoc Benchmark for more details about this module.


From a pedagogical point of view it might be instructive to write a func-
tion like timeit in the Benchmark module. Doing this we also have the pos-
sibility of tailoring such a timing function to suit our needs. The function,
here called timer, can take four arguments: (i) a function to call, (ii) a list of
arguments to be used in the function to call, (iii) the number of call repeti-
tions, and (iv) the name of the function to call. In Perl we would represent
the first two arguments by a function reference and a reference to a list. The
complete function could then take the following form:
sub timer {
my ($func_ref, $args_ref, $repetitions, $func_name) = @_;
my $t0 = time; # initial elapsed time
my ($u0, $s0, $rest) = times; # initial user and system time
for (my $i = 0; $i < $repetitions; $i++) {
&$func_ref(@$args_ref);
}
my @t1 = times;
printf("$func_name: elapsed=%g, CPU=%g\n",
time - $t0, $t1[0] - $u0 + $t1[1] - $s0);
}

The similar Python function is presented in Chapter 8.10.1 (CPU-Time
Measurements) in [?].

1.4.24 Programming with Classes


Classes are implemented in Perl using quite advanced concepts like references
and packages. Although Perl fans claim that classes in Perl are much more
flexible than those in C++ and Java, there is no doubt that programming with
classes is weirder in Perl than in C++, Java, and Python. Explaining
Perl classes in a couple of pages without first covering references and packages
is difficult and therefore omitted here.

1.4.25 Debugging Perl Scripts


Unfortunately, Perl is by default quite silent about errors. The following short
script, which tries to open a non-existing file, illustrates the point:
perl -w -e 'open(F,"<mynonexistingfile"); close(F);'

Perl executes this script without any error message (see footnote 12). The
Fatal module can be used for letting Perl speak up about run-time errors:
perl -e 'use warnings; use strict; use diagnostics;
  use Fatal qw/open/; local *F;
  open(F,"<mynonexistingfile"); close(F);'
12
Python provides instructive run-time messages by default in similar examples
(and the messages can be turned off by, e.g., appropriate exception handling in
the script).

Note that you must list the functions you want to be verbose, here open. The
reported error message now contains the helpful message
Can't open(F, <mynonexistingfile): No such file or directory
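Without the Fatal module, the common alternative is to test the return value of open explicitly and report the system error stored in $!. The sketch below wraps this in a hypothetical helper try_open, which returns the error text instead of terminating the script.

```perl
use strict;
use warnings;

# try_open is a hypothetical helper that returns an error text
# instead of terminating the script
sub try_open {
    my $filename = shift;
    open(my $fh, "<", $filename)
        or return "cannot open $filename: $!";   # $! holds the system error
    close($fh);
    return "";
}

my $msg = try_open("mynonexistingfile");
print "$msg\n" if $msg;
```

In a script one would normally write open(...) or die "...: $!\n" directly, as done with mkdir in the section on creating directories.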

The use warnings, use strict, and use diagnostics commands can help you
detect statements that are candidates for trouble. However, applying use strict
to (most of) the Perl scripts in this appendix will result in lots of
error messages about lack of the main:: prefix for all global variables or an
explicit my or local operator (to make variables local). For quick scripting
this can be a bit annoying. When writing larger scripts, on the other hand,
use strict is a good habit. Here is a sample code demonstrating some
implications of use strict:
use strict;
# introduce the global variable $counter for the first time:
$counter = 1; # generates error message
$main::counter = 1; # ok, explicit indication of package name
my $counter = 1; # ok, localizing $counter with the my operator
my $counter; $counter = 1; # equiv. with the line above

The reader is encouraged to take a look at the man pages for the Fatal,
strict, and diagnostics modules. For details on warnings, see perldoc warnings
and man perllexwarn.
Inserting print statements on the fly in the code is an efficient and widely
used debugging method among Perl programmers. Alternatively, the -d op-
tion to a Perl script enables you to interactively debug the script through a
command-line debugger,
perl -d -w mybuggyscript.pl

The -w option turns on many useful warnings about, e.g., unused variables.
The most important commands inside the debugger are s for single step, n
for single step without stepping into subroutines, x for pretty-print of data
structures and variables, and b 85 for setting a break point at line 85. More
detailed information is provided by perldoc perldebug.
There is a Perl/Tk GUI for the Perl debugger, available in the module
ptkdb. Invoke the debugger by

perl -d:ptkdb -w mybuggyscript.pl

There are several Perl debuggers with graphical interfaces, check out the links
in the Perl resources section in doc.html.
Another Perl module is Devel::Trace, which prints each statement prior
to executing it (the same effect as the -x option to Unix shell scripts).

1.4.26 Regular Expressions


The material on regular expressions explained in a Python context in
Chapter 8.2 (Regular Expressions and Text Processing) in [?] carries over
to Perl, but the surrounding Perl code is different. To test if a string $str
matches a regular expression contained in a string $pattern, one writes
if ($str =~ /$pattern/) { ... }
A specific example can be
$str = "myfile.tmp";
if ($str =~ /\.tmp$/) { print "$str has extension .tmp"; }
Backslashes and special symbols are preserved in text enclosed in forward
slashes /.../, as in Python raw strings. However, if the regular expression is
to be stored in a double-quoted string, backslashes and special Perl characters
must be preceded by a backslash:
$pattern = "\\.tmp\$";
if ($str =~ /$pattern/) { print "$str has extension .tmp"; }
With single-quoted strings a backslash is a backslash, but Perl's variable
interpolation cannot be used.
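A third possibility, sketched here as a supplement, is the qr operator, which compiles a pattern and stores it in a scalar without any extra quoting of backslashes. The file names are sample data invented for the illustration.

```perl
use strict;
use warnings;

# qr/.../ compiles a pattern and stores it in a scalar;
# backslashes are written exactly as inside /.../
my $pattern = qr/\.tmp$/;

my @names = ("myfile.tmp", "myfile.tmp.bak", "data.TMP");
my @hits  = grep { $_ =~ $pattern } @names;
print "@hits\n";

# modifiers are attached when the pattern is compiled
my $pattern_nocase = qr/\.tmp$/i;
my @hits_nocase = grep { $_ =~ $pattern_nocase } @names;
print "@hits_nocase\n";
```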

Pattern-Matching Modifiers. Perl offers pattern-matching modifiers to adjust
the meaning of the dot, ^, $, whitespace, etc. The syntax for applying a
pattern-matching modifier is like
if ($str =~ /$pattern/q) { ... }
where q denotes one or more single-character pattern-matching modifiers from
the following list:
i   case-insensitive matching
g   match globally, i.e., find all occurrences
s   let . match newline as well
m   treat string as multiple lines, i.e., change ^
    and $ from matching at only the very start or end of
    the string to the start or end of any line anywhere
    within the string (a line is from a newline to the
    next newline)
x   extend the pattern's legibility by permitting
    whitespace and comments
o   compile pattern once only (for increased efficiency)
The o modifier is a counterpart to compiling regular expressions in Python.
We can use other delimiters than forward slashes if the /.../ group is
preceded by an m, e.g.,
$found = 1 if $path =~ m#/usr/local/bin#;

Extracting Multiple Matches. Suppose you have a string with several numbers.
To extract all numbers from this string, without knowing how many
numbers there may be, we can apply the following Perl construct (see footnote 13):
13
This construct is a counterpart to Python's findall function in the re module.

$s = "3.29 is a number, 4.2 and 0.5 too";


@n = $s =~ /\d+\.\d*/g;

The array @n now contains the entries 3.29, 4.2, and 0.5.

Groups. Groups are constructed by enclosing parts of a pattern in parentheses,
cf. Chapter 8.2.4 (Using Groups to Extract Parts of a Text)
in [?]. Perl stores the matches corresponding to groups in the variables $1
(first group), $2 (second group), and so on. An illustrating code segment is
given next.
$interval = "[1.45, -1.99E+01]";
if ($interval =~ /\[(.*),(.*)\]/) {
print "lower limit=$1, upper limit=$2\n"
}
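As an alternative to $1 and $2, a match in list context returns the text of the groups, which can be assigned directly to named variables. The sketch below reuses the $interval string from above and tightens the pattern slightly (non-greedy groups and optional whitespace), which is a choice made for this illustration.

```perl
use strict;
use warnings;

my $interval = "[1.45, -1.99E+01]";

# in list context a match returns the text of the groups
my ($lower, $upper) = $interval =~ /\[\s*(.*?)\s*,\s*(.*?)\s*\]/;

# the extracted strings are converted to numbers when used as such
my $difference = $lower - $upper;
print "lower=$lower upper=$upper difference=$difference\n";
```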

Substitution. The basic syntax of substitution in Perl is
$somestring =~ s/pattern/replacement/g;

implying that pattern is replaced by replacement in $somestring. The
pattern-matching modifier g ensures that all occurrences (not only the first) are
substituted. Here is a specific example:
# change /usr/bin/perl to /usr/local/bin/perl
$line =~ s/\/usr\/bin\/perl/\/usr\/local\/bin\/perl/g;

The forward slashes in the path names must be quoted because the forward
slash is a delimiter for the substitution operator. Fortunately, this Leaning
Toothpick Syndrome [?, p. 70] can be avoided by choosing another delimiter,
e.g.,
$line =~ s#/usr/bin/perl#/usr/local/bin/perl#g;
# or
$line =~ s{/usr/bin/perl}{/usr/local/bin/perl}g;

See [?, p. 255] for more information on alternative delimiters.


The one-line Perl command for substitution listed on page 20 is so useful
that we repeat it here:
perl -pi.bak -e 's/pattern/replacement/g;' *.c *.h

The pattern is replaced by replacement everywhere in the files *.c and *.h.
A nice feature is that the -i.bak option leads Perl to take a copy, here with
extension .bak, of the original files.

Substitution with Groups. The example of switching arguments in function
calls, as covered in Chapter 8.2.9 (Substitution and Backreferences)
in [?], is readily implemented in Perl using the group variables $1, $2, and so
on. If superLibFunc(arg1,arg2) is supposed to be edited to
superLibFunc(arg2,arg1), where arg1 and arg2 are legal variable names in C and
the call can contain extra whitespace, a suitable substitution code segment
reads:
$arg = "[^,]+";
$call = "superLibFunc\\s*\\(\\s*($arg)\\s*,\\s*($arg)\\s*\\)";

# perform the substitution in a file stored as a string $filestr:


$filestr =~ s/$call/superLibFunc($2, $1)/g;

# or (less preferred style):


$filestr =~ s/$call/superLibFunc(\2, \1)/g;

print FILE $filestr; # print everything back to file

Note the need for double backslashes when the regular expression is stored
as an ordinary Perl string. The complete script for the substitution is found
in src/perl/swap1.pl. Another version, appearing in the file swap2.pl, has
comments in the regular expression:
$arg = "[^,]+";
$call = "superLibFunc # name of function to match
\\s* # possible whitespace
\\( # left parenthesis
\\s* # possible whitespace
($arg) # first argument plus optional whitespace
, # comma between the arguments
\\s* # possible whitespace
($arg) # second argument plus optional whitespace
\\) # closing parenthesis
";

$filestr =~ s/$call/superLibFunc($2, $1)/gx;

The final x is the pattern-matching modifier for comments and extra whites-
pace in regular expressions. A more typical Perl style is to write the previous
code segment without storing the regular expression in a string variable:
$filestr =~ s{
superLibFunc # name of function to match
\s* # possible whitespace
\( # left parenthesis
\s* # possible whitespace
($arg) # first argument plus optional whitespace
, # comma between the arguments
\s* # possible whitespace
($arg) # second argument plus optional whitespace
\) # closing parenthesis
}{superLibFunc($2, $1)}gx;

This version is available in the swap3.pl script.
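To see the substitution at work, here is a self-contained sketch applied to a short string. Two things are assumptions of this illustration and differ from the scripts above: the arguments are matched with \w+ (plain C variable names) instead of [^,]+, and the string $filestr is sample data.

```perl
use strict;
use warnings;

my $arg = '\w+';     # assumption: arguments are plain C variable names

my $filestr = "y = superLibFunc(x1, x2);\nz = superLibFunc( a ,b );\n";

# copy $filestr to $swapped and perform the substitution on the copy
(my $swapped = $filestr) =~ s{
    superLibFunc \s* \( \s*   # function name and left parenthesis
    ($arg)                    # first argument
    \s* , \s*                 # comma between the arguments
    ($arg)                    # second argument
    \s* \)                    # closing parenthesis
}{superLibFunc($2, $1)}gx;

print $swapped;
```

With \w+ the groups do not capture surrounding whitespace, so the rewritten calls come out uniformly formatted.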



Debugging Regular Expressions. After having spent quite some energy on
figuring out a complicated regular expression, nothing is less exciting than
seeing the regex fail to behave the way you expect. The $& variable in Perl
is set equal to the complete text matched by the specified pattern and is thus
central when debugging regular expressions.
A Perl counterpart to the Python function debugregex on page 347 (Debugging
Regular Expressions) in [?] is presented below. Because of
the difference in functionality for extracting groups and matches in Perl and
Python, the two versions of debugregex do not have a line-by-line
correspondence.
#!/usr/bin/perl

sub debugregex {
my ($pattern, $str) = @_;
$s = "does " . $pattern . " match " . $str . "?\n";

if ($str =~ /$pattern/) {
# obtain a list of groups (if present):
@groups = $str =~ m/$pattern/g;

# repeat string, but with match enclosed in square brackets:


$match = $&;
$str2 = $str; $str2 =~ s/\Q$match\E/[$match]/g;  # \Q...\E: match the text literally
$s = $s . $str2;

if ($groups[0] == 1) {
# ordinary match, no groups (see perlop man page)
} else {
for $group (@groups) {
$s = $s . "\ngroup: " . $group;
}
}
} else {
$s = $s . "No match";
}
return $s;
}

$teststr = "some numbers 2.3, 6.98, and 0.5 are here";


$pattern1 = "(\\d+\\.\\d+)"; # 3 groups (numbers)
$pattern2 = "^(\\w+)\\s+.*\\s+(\\w+)\$"; # 2 groups (some and here)

print debugregex($pattern1, $teststr), "\n";


print debugregex($pattern2, $teststr), "\n";

The output becomes


does (\d+\.\d+) match some numbers 2.3, 6.98, and 0.5 are here?
some numbers 2.3, 6.98, and [0.5] are here
group: 2.3
group: 6.98
group: 0.5
does ^(\w+)\s+.*\s+(\w+)$ match
some numbers 2.3, 6.98, and 0.5 are here?

[some numbers 2.3, 6.98, and 0.5 are here]


group: some
group: here

The Perl version of debugregex is perhaps less useful than the Python version
since regular expressions are seldom stored in strings in Perl. Instead they
appear directly inside /.../ constructions, and to use debugregex, we need to
copy the regular expression, store it in a string, and quote special characters
before performing the call. However, listing debugregex illustrates useful code
segments that can be reused in your own scripts when debugging regular
expressions.

1.4.27 Exercises
Most of the Python exercises from Chapters 2 (Getting Started with Python
Scripting) through 8 (Advanced Python) in [?] are well suited for
implementation in Perl. For an efficient hands-on training with Perl, we
recommend in particular the following set of exercises: 2.4, 2.6, 2.7, 2.11,
3.2, 3.15, 3.7, 3.16, 8.7, 8.12, 8.18, and ??.
File handles for standard input and standard output, used in Exercise 2.6
in [?], have the names STDIN and STDOUT, respectively, in Perl.
Regarding Exercise 3.15 on page 128 in [?], check out Perl's documentation
of the localtime function for extracting the correct date information (just
write the operating system command perldoc -f localtime).
The relevance of Exercise 8.17 in [?] for Perl programmers is minor, because
Perl has a special variable $/, the input record separator, which you can set
to an empty string "" to make Perl read input in a paragraph-by-paragraph
style. For example,
$/ = "";
@paragraphs = <SOMEFILE1>;
# each entry in @paragraphs is now a paragraph

# alternative:
$/ = "";
while (<>) {
# $_ is now a paragraph
}

# read the whole file into a string:


undef $/; # slurp mode
$filestr = <SOMEFILE2>;
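Since $/ is a global variable, it can be wise to limit the change to a block with local. The following sketch reads paragraphs from a string through an in-memory file handle; the text in $text is sample data made up for this illustration.

```perl
use strict;
use warnings;

my $text = "first paragraph,\nsecond line\n\n\nsecond paragraph\n";

my @paragraphs;
{
    local $/ = "";                    # paragraph mode inside this block only
    open(my $fh, "<", \$text) or die "cannot read from string: $!\n";
    @paragraphs = <$fh>;
    close($fh);
}
print scalar(@paragraphs), " paragraphs\n";
```

After the block, $/ has its default value again, so ordinary line-by-line reading elsewhere in the script is not affected.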

Exercise 1.2. Make a flexible file/directory remove function.
On page 120 (Removing Files and Directories) in [?] we present
a flexible Python function remove for removing one or more files and
directories. The function can be called with a string or list of strings as
argument. Implement the same function in Perl and explain why there is no need
to test for the argument type in Perl in this application.

Exercise 1.3. Make a generic debug print function in Perl.
Make a Perl counterpart to the Python debug function from page 111 (Functions)
in [?]. Test the function on a nested, heterogeneous list of list and hash
structures. (Hint: use the Data::Dumper module.)

Exercise 1.4. Use Getopt::Long to parse the command line.
The Getopt::Long module allows you to specify and handle command-line
options in Perl code. Read the man page documentation of this module and
apply it to improve the simviz1.pl script.

Exercise 1.5. Interpret a Perl script.
What do the following commands do?
1> perl -ne 'print if $. > 10 and $. < 22' outbox
2> perl -lne 'END{print $.}' file
3> perl -lne 'BEGIN{$c=0}$c++ if /#/;END{print "$ARGV: $c"}' \
   file1 file2 file3

One hint might be to find the source of the man pages and search for special
constructions.
Write a Bourne shell script that first writes an explanation of what the
commands do to the screen and then demonstrates the commands through
an example (generate file(s) in the Bourne shell script before running the
commands).

Exercise 1.6. Automatic editing of script headers.
The purpose of this exercise is to write a script which changes typical Perl
headings (she-bang lines) like
#!/usr/bin/perl

or other hardcoded paths, to the Perl header
: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0; # if running under some shell

Such a tool is useful when you want to make portable scripts that are always
interpreted by, e.g., your own Perl installation. Search a set of directory trees
and for each executable file that is not a directory, check if the first line
matches text of the form #!\s*/.*/perl, and if so, edit the file automatically.
In case you become interested in the magic of Perl headers, you are
encouraged to run the script src/perl/headerfun.sh.

Exercise 1.7. Mail a collection of files and directories.
The usual way to mail files electronically is to include them as attachments.
However, manual attachment is inconvenient if there are many files.
Sometimes one also wants to mail complete directory trees, and a mail script
with the following interface would then be attractive:
mailfiles.pl -s subject -c comments -a hpl@ifi.uio.no \
  -x '(.tex,.pl,.py,.c)' doc/README src app/test1 app/test2

In this interface, the -s option is used to specify a subject for the email,
-c is used to insert a text in the body of the email, -a is used to assign an
email address, and -x takes a list of extensions of files to be mailed (list items
are separated by commas, and the list is enclosed in parentheses; the quotes
are used to tell the shell that the list is one command-line argument). The
rest of the command-line arguments are the file or directory names to be
mailed. For example, doc/README is typically a file, whereas src, app/test1,
and app/test2 could be directory trees. All files with extensions coinciding
with those given through the -x option will be picked out from these directory
trees and mailed. If the -x option is not specified, all files in the trees are
to be sent.
The inner workings of mailfiles.pl will consist of (i) creating a list of all
files to be mailed, then (ii) packing these files into one single file using a tar
facility, (iii) compressing the tarfile, and finally (iv) including the compressed
tarfile as an attachment in a mail. The body of the mail should consist of the
user's comments given through the -c option to mailfiles.pl in addition to
a list of all the files that are included in the tarfile.
Perl has some useful modules for accomplishing the tasks in the mailfiles.pl
script: Archive::Tar creates tarfiles [?, p. 164], Compress::Zlib can be used for
gzip/gunzip file compression, and Mail::Send [?, p. 349] provides functions for
sending emails with attachments. Information on the usage of these modules
is provided by the associated man pages.
Write the mailfiles.pl script and write also the inverse script mail2files.pl,
which unpacks the attached tarfile from an email saved to file.


Exercise 1.8. Make Perl less silent about errors.


Create a directory without write permissions and move to this directory.
(Relevant Unix commands are mkdir tmp; chmod a-w tmp.) Run the following
Perl command to create a new directory:
perl -e 'mkdir("tmp2", 0755);'

The one-line script is executed without error messages, but no directory is
created. The similar Python command,
python -c 'import os; os.mkdir("tmp2")'

results in an error message (Permission denied). Use the Fatal module in
Perl so that mkdir prints an error message if creation of a new directory is
unsuccessful.


Exercise 1.9. Debug a Perl substitution command.


Assume that we want to remove comments like
/*
REMAINING TASKS:
...
...
*/

from a set of files. We try the following one-line substitution command in
Perl:
perl -pi.old~~ -e 's#/\*\s*REMAINING TASKS.*?\*/##gs;' f1.c f2.c f3.c

However, this command does not seem to work. Can you explain what is
going on in detail and create a script that works as intended?


Exercise 1.10. Document a script using POD.


First, read about Perl's documentation system POD. A good source is
perldoc perlpod. Then, find a non-trivial script you have written in Perl and
equip it with documentation in man-page style, using the POD tool. Use
the programs pod2html and pod2man to generate man pages for the script in
HTML and nroff formats. 

1.4.28 Building and Using Modules


The simplest possible Perl module consists of some Perl code stored in a file
with extension .pm. Here is a sketch:
# module file: MyMod.pm

package MyMod;

$logfile = ""; # variable shared among subroutines

sub set_logfile {
($logfile) = @_;
print "logfile=$logfile\n";
}

sub another_routine {
print "inside another_routine\n";
for $arg (@_) { print "argument=$arg\n"; }
}

1;

The package statement defines a namespace MyMod, i.e., user code must prefix
all variables and functions in the module by the package name: MyMod::. This
avoids name clashes with other modules that have subroutines or variables
with the same names (e.g., set_logfile or logfile).
A module file must end with a statement that evaluates to true. The
standard choice is 1; and Perl will issue a compilation error if you forget this
crucial statement.
In the user's code the MyMod module is imported by writing use MyMod.
This import statement will be successful only if Perl knows where to find
your module. There are four ways of telling Perl about your directory of
modules:
1. apply the use lib statement, e.g., use lib "/home/hpl/lib";
2. set the PERLLIB or PERL5LIB environment variables,
3. modify Perl's list of search directories for modules, or
4. install the module in the official Perl library directories.
Suppose we have stored the MyMod.pm in the subdirectory src/intrp/perl
under the directory stored in the environment variable scripting. Let us
explain the details of the four techniques for making Perl aware of our MyMod
module. The use lib statement has the form
use lib "/home/hpl/scripting/src/perl";

We can work with paths built from environment variables, to make the script
more portable:
use lib "$ENV{scripting}/src/perl";

The increased portability comes at the cost of requiring all users to have
particular environment variables set correctly. (This requirement makes the
technique less applicable to CGI scripts.)
As an alternative to use lib you can include your directories with modules
in an environment variable PERLLIB or PERL5LIB. Here is an example in a Bash
start-up file:
export PERLLIB=$PERLLIB:$scripting/src/perl

Another solution is to modify Perl's list of directories, @INC, used to search
for modules [?, p. 171]:
BEGIN { unshift @INC, "$ENV{scripting}/src/perl"; }
use MyMod;

If Perl has trouble finding a module, it will print the
content of the @INC variable, thus letting you check if all required directories
are present or not.
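You can also inspect the search path yourself before anything goes wrong. The following is a minimal sketch; the directory /home/hpl/lib is reused from the use lib example and does not need to exist for the script to run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# prepend a private module directory to @INC at compile time
use lib "/home/hpl/lib";

# @INC lists the directories searched for modules, in search order
print "Module search path:\n";
print "  $_\n" for @INC;

# %INC records which file each loaded module actually came from
print "strict.pm was loaded from $INC{'strict.pm'}\n";
```

Directories added by use lib appear first in the list and are therefore searched before the official library directories.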
Installing a module in the official Perl library directories should be done
in terms of a Makefile.PL file automatically generated by the h2xs program.
An excellent description of this process is available in the item "Use h2xs to
generate module boilerplate" in the Effective Perl book [?].
Being sure that Perl will find your new module, you can start using it. A
complete test is shown next.
#!/usr/bin/perl

use lib "$ENV{scripting}/src/intro/perl";
use MyMod;

MyMod::set_logfile('tmp.log');
print "MyMod::logfile=$MyMod::logfile\n";
$p1 = "just some text";
MyMod::another_routine($p1, 'mytmp.tmp');

The visibility of the module's functions and variables can easily be controlled
(public versus private access). We refer to the description in [?] for details.
The example of a Perl module presented in this section is very simple;
Perl has many additional features to control the behavior of a module. An
overview of making modules is provided by the question "How do I create a
module?" in the Perl FAQ (see link from doc.html). You can look up [?] for
a more comprehensive description of making modules.
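One common way of controlling visibility is the standard Exporter module combined with lexical (my) variables. The sketch below is an assumption about how this could look, not the approach described in [?]. The module name MyMod2 and the function get_logfile are hypothetical, and the package is defined inline so that the example runs as a single file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant of MyMod; normally this block would be the
# contents of a separate file MyMod2.pm ending with 1;
BEGIN {
    package MyMod2;
    use Exporter 'import';             # borrow Exporter's import method
    our @EXPORT_OK = qw(set_logfile);  # names users may import on request

    my $logfile = "";                  # lexical: invisible outside this block

    sub set_logfile { ($logfile) = @_; return $logfile; }
    sub get_logfile { return $logfile; }

    $INC{'MyMod2.pm'} = __FILE__;      # mark the module as already loaded
}

use MyMod2 qw(set_logfile);            # import one function into our namespace

set_logfile('tmp.log');                # imported: no MyMod2:: prefix needed
print "logfile=", MyMod2::get_logfile(), "\n";  # not imported: prefix required
```

Functions that are not exported can still be reached through the package prefix, so privacy of subroutines is a convention in Perl, whereas the lexical variable $logfile is truly inaccessible from user code.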

1.4.29 Binary Input/Output


Perl has the functions pack and unpack for handling binary data, and these
functions work in much the same way as in Python (see Chapter 8.3.6
in [?]). For example, converting the Perl variable
$np to the binary int format in C can be done with a call to pack:
$cvar = pack('i', $np);

Similarly, converting $r to the binary double format in C is done by
$cvar = pack('d', $r);

The format specification for the most common formats is the same as in
Python, see perldoc -f pack for a complete list. Writing an array of real
numbers to file in C double format can be performed by the loop
foreach $r (@array) { print FILE pack('d', $r); }

Reading C doubles from file can be done using read to get a specified number
of bytes and then unpack to convert a binary number to a Perl variable:
@array = ();
$d = length(pack('d', 0)); # no of bytes in a C double
# read $d bytes into $data in each pass in the loop:
while (($n = read(FILE, $data, $d)) == $d) {
push(@array, unpack('d', $data)); # convert to Perl variable
}

Forcing a file to be treated as a binary file, which is necessary on some
operating systems, is obtained by calling binmode(FILE), see perldoc -f binmode.
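The fragments above can be combined into a complete round trip. This is a minimal sketch assuming a scratch file named tmp_doubles.bin (a hypothetical name) in the current directory; it uses lexical filehandles instead of the bareword FILE.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @array    = (0.5, 1.25, -3.0, 1e10);
my $filename = "tmp_doubles.bin";       # hypothetical scratch file

# write the numbers as C doubles
open(my $out, '>', $filename) or die "Cannot open $filename: $!";
binmode($out);                          # required on some operating systems
print {$out} pack('d', $_) for @array;
close($out);

# read the numbers back, one C double at a time
my $d = length(pack('d', 0));           # no of bytes in a C double
open(my $in, '<', $filename) or die "Cannot open $filename: $!";
binmode($in);
my @copy;
my $data;
while ((read($in, $data, $d) // 0) == $d) {
    push(@copy, unpack('d', $data));    # convert to Perl variable
}
close($in);
unlink($filename);                      # clean up the scratch file

print "read back: @copy\n";
```

Since the native double format is used for both writing and reading, the file is only portable between machines with the same byte order and floating-point representation.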

1.5 Installing Perl and Additional Modules


This section explains how to install Perl and the modules referred to in this
document. The setup follows Appendix A in [?]. You should therefore make
sure that you have set the right environment variables and made the right
directories according to [?] before proceeding.

1.5.1 Installing Basic Perl


Download a tarfile with the stable Perl distribution (see doc.html for an
appropriate link) and unpack it in $SYSDIR/src/perl. Go to the new directory,
which was created when unpacking the tarfile. First you need to run a
configuration script to set up the proper makefiles for compiling Perl:
sh ./Configure -Dprefix=$PREFIX -sde

If this script executes successfully, the next step is to compile Perl by typing
make. To test if the building of Perl was successful, type make minitest (can be
omitted). To install Perl, type make install. Object files and other temporary
files from building Perl can be removed by the command make clean.

1.5.2 Manual Installation of Perl Modules


The CPAN archive contains a large number of Perl modules. If you lack some
special functionality in Perl and plan to develop it yourself, check out CPAN;
very often you will find that others have already done the job! Modules from
CPAN can be installed either manually or automatically. We recommend
using the automatic procedure described below under the heading "Automatic
Installation of Perl Modules".
Manual installation of modules from CPAN follows this recipe:
1. Download a tarfile, e.g., SomeMod-0.01.tar.gz. (Here 0.01 denotes the
version number of the SomeMod module.) Unpack it in some suitable
directory, e.g., $SYSDIR/src/perl/tools (if you follow the set-up in this
appendix):
gzip -dc SomeMod-0.01.tar.gz | tar xvf -

2. Go to the module's directory:
cd SomeMod-0.01

3. All modules shall contain a file called Makefile.PL, which generates a
makefile. Run
perl Makefile.PL

and then
make

Sometimes you will see that a module depends on other modules, so you
need to install these first.
4. To test the module you can write make test. This step is optional.
5. To install the module such that Perl can find it when you say use SomeMod
in some script, you need to run
make install

This will copy the necessary files to the directories where Perl is installed.
If you have your own installation of Perl, this will work fine, but if Perl
was installed by your system administrator, you probably do not have
write permission in the relevant directories. In that case, you can install a
private copy of the module by specifying a path to the desired installation
directory (say, /home/hpl/lib):
perl Makefile.PL LIB=/home/hpl/lib

When you use the module in a Perl script you need to make Perl aware
of where the module is installed, e.g., by writing
use lib "/home/hpl/lib";

prior to use SomeMod (see Chapter 1.4.28).

1.5.3 Automatic Installation of Perl Modules


Perl has a special module called CPAN for automating installation of Perl
modules. The first time you make use of this utility you need to configure
your set-up for the CPAN module. Invoke the CPAN shell:
perl -MCPAN -e shell

and answer no to the question "Are you ready for manual configuration?".
This will initiate an automatic configuration, which might be sufficient in
many cases, especially if you have your own Perl installation. In case you do
not like the decisions that were made, invoke the CPAN shell again and issue
the command o conf init to revisit the dialog. For example, you may desire
to specify a LIB option when the CPAN module runs perl Makefile.PL for
you (see the description of manual installation of Perl modules).
To install a module, invoke the CPAN shell as described above and just
say install and the module name, e.g.,
install Tar

The latest version of the Tar module will be fetched from a nearby CPAN
site, together with all the modules that Tar depends on and that you do not
already have. The tarfiles are unpacked and the installation procedure is run.
You can also install modules without using the interactive CPAN shell:
perl -MCPAN -e 'install Tar'

We remark that even in this case some of the installation procedures will
prompt the user for extra information, but the default values will often be
sufficient. You can perform a manual installation if you run into problems (by
default, the CPAN shell unpacks the sources in .cpan/build in your home
directory; just go to a module's directory and issue the commands required
for manual installation).
To learn about the modules, look at their man pages (write, for instance,
perldoc Tar).

1.5.4 The Required Perl Modules


The Perl examples in this document require some CPAN modules. You can
install these manually or automatically. In the latter case, run
perl -MCPAN -e 'install Bundle::libnet'
perl -MCPAN -e 'install Tk' # Perl/Tk
perl -MCPAN -e 'install LWP::Simple' # WWW tools

# for CGI programming (dynamic Web pages):
perl -MCPAN -e 'install CGI::Debug'
perl -MCPAN -e 'install CGI::QuickForm'

# for an exercise:
perl -MCPAN -e 'install Tar'

# for the Regression module in "Python Scripting for
# Computational Science" book:
perl -MCPAN -e 'install Algorithm::Diff'

1.6 Perl Versus Python


For newcomers to Perl and Python it may be difficult to judge which language
to use in a given project. Below are some comments on pros and cons of these
two languages.

1.6.1 Python's Advantages


Python is easy to learn. Many people find Python to be considerably
easier to learn than Perl. Python has also been used for teaching
programming in high schools with success (see link in doc.html).
Python has a very clean syntax. Python's clean syntax makes programs
easy to read and modify, also for others than the author. Going back to
a Python script after a year is normally much easier than going back to
the same type of code in Perl. A Python slogan is "there should be one,
and preferably only one, obvious way to do it" (type import this in a
Python interpreter to see this and more slogans), in contrast to Perl's
"There's More Than One Way To Do It" (see Chapter 1.3). The
one-way-to-do-it philosophy in Python eliminates to a large extent the need for
detailed programming standards in a project. Different Perl cultures use
different styles and constructs, a fact that can be confusing when novice
Perl programmers read other people's code.
Python has easy-to-use data structures. Lists and dictionaries (hashes)
can be heterogeneous and arbitrarily nested in Python and Perl, but
in Python there is no need to work with references and the associated
dereferencing syntax as in Perl. That is, the Python syntax is intuitive
and makes it easy to work with complicated, nested data structures.
Moreover, the keys in Python dictionaries can be arbitrary (immutable)
objects, whereas the keys in Perl hashes are limited to plain text.
Python has full and natural support for object orientation. Object-oriented
programming is more awkward in Perl as it was added at a late stage in
the development of the language.
Python's application domain is wide. Python is a combination of a
scripting language for gluing existing applications (for which Unix shells, Tcl,
and Perl are competing tools) and a system programming language with
object-orientation and support for complicated, nested, heterogeneous,
user-defined data structures (an area where C++, Java, and to some
extent Perl are competing tools). Perl also supports complicated data
structures, but with a less attractive syntax; in Python you can work
directly with the objects in the structure, while in Perl you must work
through references, a fact that clutters the code with combinations of
@, $, and backslashes and makes it less readable. (Compare, for example,
the displaylist subroutine on page 31 with the corresponding straightforward
implementation in Python.)
Python supports multi-language programming in an easy way. There are
several tools that make it easy to combine Python with C, C++, or
Fortran libraries, as shown in Chapters 5, 9, and 10 in [?]. Combining
Perl with C++, and in particular Fortran, is less well supported.
Python programs can easily be equipped with a GUI. There are
well-developed interfaces to various GUI tools, e.g., wxWindows, Qt, Gtk, Java
Foundation Classes (JFC), and MFC (Microsoft Foundation Classes),
besides this book's main GUI library: Tk. Although Tk, and other libraries
like Gtk, can be used from Perl as well, we find that Python's simple
class construct provides a convenient and efficient way of simplifying the
development of independent and reusable GUI components.
Python supports numerical programming. There are Python modules for
efficient numerics and visualization, making Python the most widespread
and perhaps the preferred language for scripting in scientific computing.
Python offers extensive error checking. Descriptive error messages are
issued when something goes wrong at run-time. Python programs never
dump core; you always get a traceback, showing where and why the error
arose. Perl is much more silent (by default; see Chapter 1.4.25).
Python statements can be written in an interactive shell. There is a very
user-friendly interactive shell in the IDLE tool. Especially in combination
with Python interfaces to your Fortran and C/C++ codes (as explained
in Chapter 5.3 in [?]), this interactive mode allows you to work in a
Matlab-like style with your own libraries. To use Perl interactively, you
need to invoke a debugger.
Python and Java are seamlessly integrated. A special version of Python,
called Jython, is implemented in 100% pure Java. Java classes can be
used directly in Python programs, and Java code can
make use of Python classes.
The popularity of Python is growing fast. Many application areas where
Perl has a strong tradition, e.g., databases, interactive Web pages, or text
processing, see an increasing use of Python.
It only takes a few hours to get started and be productive in Python. After
an efficient and gentle start, you can pick up the more advanced parts of
the language in parallel with applying it to your programming projects.
The learning curve is, according to my teaching experience, clearly steeper
in Perl.
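To make the remarks above about references in nested Perl data structures concrete, here is a small sketch. The structure and its contents (%experiment, params, times) are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# a hash whose values include references to an anonymous hash and array
my %experiment = (
    name   => 'test1',
    params => { m => 1.2, b => 0.7 },   # { } gives a reference to a hash
    times  => [0.0, 0.5, 1.0],          # [ ] gives a reference to an array
);

my $damping = $experiment{params}{b};   # arrow between subscripts is optional
my $last    = $experiment{times}->[-1]; # explicit arrow dereferencing
my @times   = @{ $experiment{times} };  # dereference (copy) the whole array
my $ref     = \%experiment;             # backslash creates a reference
my $name    = $ref->{name};             # access through the reference

print "b=$damping last=$last n=", scalar(@times), " name=$name\n";
```

In Python the corresponding accesses would be written directly on the objects, e.g., experiment['params']['b'] and experiment['times'][-1], with no distinction between a structure and a reference to it.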

1.6.2 Perl's Advantages


Perl and Python have quite similar functionality and application areas. Even
if you know Python, there are many reasons why you should seriously consider
picking up the Perl language too. Some good reasons are mentioned below.
Perl is probably the most widespread and popular dynamically typed lan-
guage today. There is a wide collection of Perl-based tools freely available
on the Internet. Therefore, it is likely that you encounter Perl scripts in
your work, and knowing some basic Perl puts you in a position where you
can fix weaknesses or tailor the scripts to your own needs.
Perl has an extensive set of add-on modules that can be freely obtained
from the searchable CPAN archive. You will often experience that there
is a Perl module available for your scripting problem at hand.
Python scripts can make use of Perl code through the pyperl tool.
Perl is particularly popular for writing CGI scripts, i.e., creating
interactive Web pages. Perl's CGI tools are more comprehensive than those
available for Python.
Perl has (together with Python) more comprehensive interfaces to
operating system commands and file operations than most other scripting
languages.
Regular expression syntax is best documented in a Perl context, i.e., you
need to understand Perl code to look up all the good regular expression
documentation. Although Python supports most of Perl's regular
expression syntax, many programmers (even Python fans) prefer Perl for heavy
text processing with regular expressions.
Perl has some nice features that allow small scripts to be expressed as
very compact one-line operating system commands.
Perl's "There's More Than One Way To Do It" principle gives the
programmer greater flexibility (and perhaps more fun and challenges) than
when using Python to solve a problem.
Perl is usually faster than Python (up to about twice as fast) for typical
application areas of scripting in computational science and engineering.

1.6.3 Efficiency
It may be of interest to compare the efficiency of Perl versus Python. We
have made a series of scripts in the directory tree src/efficiency for com-
parisons of Perl, Python, and to some extent Tcl, C, and C++. The relative
efficiency of the datatrans1.* scripts is reported in Chapter 2.2.7
in [?]. In that example we found that Perl ran
almost twice as fast as Python. In the following we present two other
applications concerning reading text files with regular expressions and performing
mathematical operations in for loops. We remark that the relative efficiency
of the two languages reported here depends on the version of the interpreters,
the C compiler used to compile them, and the hardware platform. The best
way to extrapolate the results to your own programming life is to
repeat the tests on your own machine.

Interpreting Text Files via Regular Expressions. The subdirectory regex of
src/efficiency tests the efficiency of reading a big data file consisting of
textual array data. Each line takes the form
[2,10,0]=16.7

meaning that a three-dimensional array has the value 16.7 for the indices
(2, 10, 0). The purpose is then to read this file, line by line, extract the indices
and the array value, using the regular expression
\[(\d+),(\d+),(\d+)\]=(.*)

and fill the corresponding array item. The Python implementation, in regexread.py,
applies a three-dimensional NumPy array, whereas the Perl implementation,
in regexread.pl, stores the numbers in a one-dimensional array (since
three-dimensional arrays in Perl go via references and hence need some extra loops
initially). The data file is created by the makedata.py script.
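The core of such a reading loop can be sketched as below. This is not the regexread.pl script itself; the array dimensions, the row-major index mapping, and the sample lines are assumptions chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($nx, $ny, $nz) = (3, 11, 1);       # hypothetical array dimensions
my @a = (0) x ($nx*$ny*$nz);           # flat storage for the 3D array

# sample lines standing in for the contents of the data file
my @lines = ('[2,10,0]=16.7', '[0,3,0]=-1.5');

for my $line (@lines) {
    if ($line =~ /\[(\d+),(\d+),(\d+)\]=(.*)/) {
        my ($i, $j, $k, $value) = ($1, $2, $3, $4);
        # map the three indices to one index (row-major ordering assumed)
        $a[($i*$ny + $j)*$nz + $k] = $value;
    }
}

print "a[2,10,0] = ", $a[(2*$ny + 10)*$nz + 0], "\n";
```

The parenthesized groups in the regular expression are available as $1, $2, $3, and $4 after a successful match, which is why the assignment is placed inside the if test.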
As usual in the efficiency tests in [?], we normalize the CPU time by the
CPU time of the fastest implementation. Perl then ran at 1.0 and Python at
3.3.
(A faster solution to this programming problem would be to avoid regular
expressions and instead split the line first with respect to =, strip the first (left-hand
side) string, remove the brackets (first and last character), and then split with
respect to comma. This implementation and its efficiency relative to using
regular expressions is left as an exercise.)
If we instead load a data file where the array data are written in NumPy
format, such that a plain eval(file.read()) type of statement recreates the
array, the Python script (readeval.py) is very short and ran at 2.8 CPU time
units. These numbers are independent of the size of the array as long as the
array fits into memory.
From these tests on interpreting text files with regular expressions, we may
conclude that Perl is more than three times faster than Python. The reading
in Python is a bit faster if the array is in NumPy format (which is expected
since there is less text to interpret in this format). We remark that large
array structures should preferably be stored in binary format. The difference
between Perl and Python is then much smaller.
Function Evaluations. The next test concerns computing the integral
$\int_a^b f(x)\,dx$ by the Trapezoidal rule:

$$I = h\left(\frac{1}{2} f(a) + \sum_{i=1}^{n-1} f(a + ih) + \frac{1}{2} f(b)\right), \quad h = (b - a)/n .$$

A program implementing this formula, which is in fact the purpose of
Exercise 4.5 in [?], will spend almost all its time on evaluating
the function f(x).
The scripts for computing I are found in src/efficiency/integration;
int.py is the Python implementation, int.pl is the corresponding Perl ver-
sion, and int_vec.py is a vectorized Python version based on NumPy. Since
int.pl ran faster than int.py, we scale the CPU times by the former. The
Perl version then ran at 1.0, plain Python at 1.2, and the vectorized Python
version at 0.1. Matlab implementations, based on both plain loops and vec-
torization, ran at approximately the same speed as the corresponding Python
versions.
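For reference, a plain-loop Perl implementation of the Trapezoidal rule could look like the sketch below. It is not the int.pl script from src/efficiency/integration; the subroutine name, the argument order, and the test function f(x) = x^2 are assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Trapezoidal rule for the integral of $f from $lower to $upper,
# using $n intervals of equal length.
sub trapezoidal {
    my ($f, $lower, $upper, $n) = @_;
    my $h   = ($upper - $lower)/$n;
    my $sum = 0.5*($f->($lower) + $f->($upper));
    for my $i (1 .. $n-1) {
        $sum += $f->($lower + $i*$h);
    }
    return $h*$sum;
}

# integrate f(x) = x^2 over [0,1]; the exact value is 1/3
my $I = trapezoidal(sub { my ($x) = @_; return $x*$x; }, 0, 1, 100);
printf("I=%g (error=%g)\n", $I, abs($I - 1/3));
```

The function to be integrated is passed as a reference to an anonymous subroutine and called with the $f->(...) syntax, so nearly all the CPU time is spent in these subroutine calls.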

1.7 GUI Programming with Perl/Tk

Two types of graphical interfaces to programs are covered in this section.
Chapter 1.7 explains how to apply the Tk library from Perl for creating
standard graphical user interfaces in Perl programs. Chapter 1.8 addresses
graphical user interfaces in Web pages. These interfaces are handled by CGI
scripts.
The exposition in this section assumes the reader to be familiar with
Python/Tkinter programming from Chapter 6 in [?] in general and
Chapter 6.1 in [?] in particular.
The Tk library for easy creation of graphical user interfaces can be called
from Perl, using a Perl module referred to as Perl/Tk. Programming Perl/Tk
is, not surprisingly, very similar to programming Python/Tkinter. The pur-
pose of this section is therefore to illustrate the syntax differences a Python/Tkinter
programmer faces when working with Perl/Tk. The purpose is to get the
reader started with the Perl/Tk syntax such that further GUI programming
in Perl scripts for a Python/Tkinter programmer becomes a trivial task.
All the GUI scripts *.pl in the src/perl directory correspond to the simi-
lar Python scripts with the same basenames and extension .py in src/py/gui.

Documentation. A book [?] is devoted to GUI building with Perl. This book
also contains a complete reference to Perl/Tk widgets. Tk comes, of course,
with man pages that can be accessed with perldoc: perldoc Tk::Scale, for
instance. There is also a Perl/Tk quick reference book [?]. The doc.html con-
tains links to the Perl/Tk FAQ as well as to electronic Perl/Tk introductions.
The source of the Perl/Tk distribution contains a demo of Perl/Tk
widgets. Go to the Perl/Tk package's directory, then to its demos subdirectory,
and write perl widgets. A GUI appears with a list of Tk demos. For each
demo you can view the source code in a separate window. Newcomers to
Perl/Tk might be confused by the Perl framework for organizing the de-
mos. However, concentrating on the specific widget commands and knowing
the meaning of the qw operator (see page 22) should be sufficient for taking
advantage of this collection of Perl/Tk examples.

Megawidgets. Perl/Tk contains the Tk package plus many of the megawidgets
in the Tix library (which is originally a Tcl/Tk extension). The collection of
widgets is quite rich and includes scrolled widgets, notebooks, advanced lists,
combo boxes, and so on. A nice feature of Perl/Tk is that the module comes
with all the necessary C files for compilation, that is, there is no need to
install the native Tcl/Tk libraries or other Perl extension modules. All you
have to say to install Perl/Tk is perl -MCPAN -e 'install Tk'.
We shall in the following write the Scientific Hello World GUI scripts
from Chapter 6.1 in [?] using Perl's Tk
extension. This will point out how easy it is to use Tk from Perl once you are
familiar with the Tk widgets, their functions, arguments, and functionality.
The scripts presented in this section are found in the directory src/perl.

1.7.1 The First Perl/Tk Encounter


We start out with creating a GUI as in Figure 6.1 on page 222
in [?]. In this GUI the user can fill in a number, push the "equals" button and
see the sine of the number written to the right in the window. The script
coded in Perl/Tk is called hwGUI1.pl and looks as follows.
: # *-*-perl-*-*
eval 'exec perl -w -S $0 ${1+"$@"}'
if 0; # if running under some shell

use Tk;
# create main window perl variable ($main_win) to hold all widgets:
$main_win = MainWindow->new();
$top = $main_win->Frame(); # create frame
$top->pack(-side => 'top'); # pack frame in main window

$hwtext = $top->Label(-text => "Hello, World! The sine of");
$hwtext->pack(-side => 'left');

$r = 1.2; # default
$r_entry = $top->Entry(-width => 6, -relief => 'sunken',
-textvariable => \$r);
$r_entry->pack(-side => 'left');

$compute = $top->Button(-text => " equals ", -command => \&comp_s);
$compute->pack(-side => 'left');

sub comp_s { $s = sin($r); }

$s_label = $top->Label(-textvariable => \$s, -width => 18);
$s_label->pack(-side => 'left');

MainLoop();

Let us explain the script line by line. The first three lines are standard and
ensure that we call the first Perl interpreter in our path, see Chapter 1.1.1.
Any Perl script using the Tk extension must begin with
use Tk;

Unfortunately, many Perl installations do not contain the Tk package. You
can easily test whether your Perl interpreter understands Tk commands by
running
perl -e 'use Tk'

If you get an error message of the type
Can't locate Tk.pm in @INC (@INC contains: ....

you need to get Perl/Tk installed. Instructions are provided in Chapter 1.5.
Before we can create and pack widgets with Perl/Tk we need to make a
main window:
$main_win = MainWindow->new();

Having the main window, we normally start with making a frame to hold all
our widgets and subframes:
$top = $main_win->Frame(); # create frame
$top->pack(-side => 'top'); # pack frame in main window

The label "Hello, World! The sine of" is constructed in this way in
Perl/Tk:
$hwtext = $top->Label(-text => "Hello, World! The sine of");
$hwtext->pack(-side => 'left');

The reader should notice the close similarity with the corresponding code
expressed in Python/Tkinter:
hwtext = Label(top, text="Hello, World! The sine of")
hwtext.pack(side='left')

If desired, we can merge the creation of the Label object and its packing in
one statement:
$top->Label(-text => "Hello, World! The sine of")
->pack(-side => 'left');

Now we do not have any variable holding the label anymore, which means that
we cannot update (configure) the label later. Usually, this is a disadvantage.
A text entry tied to the Perl variable $r is created by calling the Entry
method in our top frame:
$r = 1.2; # default
$r_entry = $top->Entry(-width => 6, -relief => 'sunken',
-textvariable => \$r);
$r_entry->pack(-side => 'left');

The width of the text entry equals 6 characters, and the entry is displayed
with a sunken relief, giving a 3D effect in the GUI. Technically, we send a
reference to the variable $r, denoted by \$r in Perl, as argument to the entry
widget.
The button in our GUI is supposed to compute $s = sin($r) when being
pressed. The corresponding Perl/Tk code is
$compute = $top->Button(-text => " equals ", -command => \&comp_s);
$compute->pack(-side => 'left');

sub comp_s { $s = sin($r); }

The subroutine to be invoked by pressing the button is assigned by the
-command argument. The value of the argument is a reference to the
subroutine, produced by the syntax \&comp_s in Perl.
The last widget is a label for displaying the result of the sine computation.
The text in this label must reflect the value of the Perl variable $s:
$s_label = $top->Label(-textvariable => \$s, -width => 18);
$s_label->pack(-side => 'left');

or shorter:
$top->Label(-textvariable=>\$s, -width=>18)->pack(-side=>'left');

Finally, we need to call the event loop when all GUI components are declared
and packed:
MainLoop();

This loop waits for the user's events, such as pressing buttons and writing
text, and performs the corresponding actions as defined by the widgets. If
you forget to call MainLoop, nothing will be shown on the screen and the script
simply terminates.

1.7.2 The Similarity of Python/Tkinter and Perl/Tk


At this stage we can outline how a typical Python/Tkinter statement trans-
lates to Perl/Tk. Consider
widget_var = widget_type(parent_widget, opt1=v1, opt2=v2,
command=myfunc)

In Perl/Tk this takes the form


$widget_var = $parent_widget->widget_type(-opt1 => v1, -opt2 => v2,
-command => \&myfunc);

1.7.3 Binding Events


Binding the event "pressing Return in the text entry $r_entry" to a call of the
comp_s subroutine is accomplished with this statement in Perl/Tk:

$r_entry->bind('<Return>', \&comp_s);

Exercise 1.11. Perl/Tk version of the GUI in Chapter 6.2 (Adding GUIs to
Scripts) in [?].
Write a GUI for the simulation and visualization script in Chapter 1.2,
following the simvizGUI1.py Python script from Chapter 6.2 (Adding GUIs to
Scripts) in [?].
Hint: The PhotoImage class in the Python script takes the name Photo in
Perl/Tk, and the construction is also slightly different. Another difference
between Python/Tkinter and Perl/Tk is that the from_ argument in Scale
has the plain name from in Perl/Tk. Check out the Tk::Image and Tk::Scale
man pages with perldoc.


A List of Common Widget Operations. Chapter 6.3 (A List of Common Widget
Operations) in [?] describes a script demoGUI.py containing many of
the most common Tk widgets. The script serves both as a demo of the look
and workings of the widgets and as a kind of quick reference for typical widget
constructions. Newcomers to Tk GUI programming may find the demo script
very useful, as one can simply copy code segments from this script to get a
widget up and running. A simplified (and actually quite different) Perl/Tk
version of the demoGUI.py script is found in demoGUI.pl in the src/perl
directory.

1.8 Web Interfaces and CGI Programming

The number of Perl users increased dramatically in the latter half of the
1990s when Perl's powerful text processing features were recognized to make
interactive Web pages based on the Common Gateway Interface (CGI) much
easier than in traditional languages like Fortran, C, C++, and even Java.
Perl is now a core technology in the Internet programming world, and several
modules for CGI programming are available.
Perl offers more CGI and network programming modules than Python.
On the other hand, the powerful Zope and Plone tools for managing dynamic
Web pages are based on and programmable from Python. The optimal CGI
programming environment is therefore likely to be a Python program calling
up special Perl functionality when desired. This is fortunately a reality, as the
company ActiveState has created a tool, pyperl, for calling Perl from Python.
Some Python tools for CGI programming are explained in Chapter 7 (Web
Interfaces and CGI Programming) in [?]. Here we shall have a look
at similar tools in Perl and see how they apply to the examples from [?].
This implies that the present section is written for readers who have grasped
the basics of CGI programming in Python from Chapter 7 in [?].

1.8.1 Web Versions of the Scientific Hello World Program


We start out with a Web interface to our Scientific Hello World program.
Figure 7.1 on page 290 in [?] displays a Web page where the user can fill in
a value r, click the "equals" button, and see sin(r) being written in a new
Web page, see Figure 7.2 on page 290 in [?]. The Web page with a text entry
is a plain HTML file:
<HTML><BODY BGCOLOR="white">
<FORM ACTION="hw1.pl.cgi" METHOD="POST">
Hello, World! The sine of
<INPUT TYPE="text" NAME="r" SIZE="10" VALUE="1.2">
<INPUT TYPE="submit" VALUE="equals" NAME="equalsbutton">
</FORM></BODY></HTML>

This file is identical to the hw1-py.html file from page 291 in [?], used in
conjunction with a CGI script in Python, except for the ACTION parameter,
which now specifies a Perl script
hw1.pl.cgi to be called. This CGI script is almost a line-by-line translation of
the similar script in Python. Perl has a module called CGI with easy access to
form variables. As an example, the value of the form parameter r is extracted
by calling the param function in the CGI module:
use CGI;
$form = CGI->new();
$r = $form->param("r");

With this information, and having a look at the Python counterpart hw1.py.cgi,
we can easily create the CGI script in Perl:
#!/usr/local/bin/perl
use CGI;
# required opening of all CGI scripts with output:
print "Content-type: text/html\n\n";
# extract the value of the variable "r" (in the text field):
$form = CGI->new();
$r = $form->param("r"); $s = sin($r);
# print answer (very primitive HTML code):
print "Hello, World! The sine of $r equals $s\n";

The script is found in the file src/perl/hw1.pl.cgi. Observe that the output
is incomplete HTML code, but it will most likely be shown correctly
in a browser.
In an improved version of this CGI script we let the user stay within the
same page, i.e., the Web page acts as a sine calculator. This is accomplished
by letting the script generate the HTML code for the Web form:
#!/usr/local/bin/perl
use CGI;
# required opening of all CGI scripts with output:
print "Content-type: text/html\n\n";
# extract the value of the variable "r" (in the text field):
$form = CGI->new();
if (defined($form->param("r"))) {
$r = $form->param("r"); $s = sin($r);
} else {
$r = "1.2"; $s = ""; # default
}
# print form with value:
print <<EOF;
<HTML><BODY BGCOLOR="white">
<FORM ACTION="hw2.pl.cgi" METHOD="POST">
Hello, World! The sine of
<INPUT TYPE="text" NAME="r" SIZE="10" VALUE="$r">
<INPUT TYPE="submit" VALUE="equals" NAME="equalsbutton"> $s
</FORM></BODY></HTML>
EOF

The if test can in principle be omitted, that is, we can write


$r = $form->param("r");

even when no form variables are available (which is the case the first time
the script is launched). However, the sine computation requires that $r is a
valid real number. The script can be found in the file src/perl/hw2.pl.cgi
and is a line-by-line translation of the similar Python script.
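
One way to make this test more robust is to check that the form value really is numeric before calling sin. The sketch below assumes a hypothetical helper, sine_fields, which is not part of hw2.pl.cgi; it relies on looks_like_number from the core Scalar::Util module and returns the two strings to be inserted in the form:

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper (not in hw2.pl.cgi): map the raw form value to
# the pair ($r, $s) that is inserted in the HTML form.
sub sine_fields {
    my ($raw) = @_;
    return ("1.2", "") unless defined $raw;     # first visit: defaults
    return ($raw, "not a number")               # reject non-numeric text
        unless looks_like_number($raw);
    return ($raw, sin($raw));
}

my ($r, $s) = sine_fields("0.5");
print "r = $r, s = $s\n";
```

In the CGI script the call could read ($r, $s) = sine_fields(scalar $form->param("r")), where scalar forces a single value also when the parameter is missing. Note that looks_like_number accepts whatever Perl itself treats as numeric, so stricter rules would need an explicit pattern.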

1.8.2 Debugging CGI Scripts in Perl with CGI::Debug


Debugging CGI scripts quickly becomes challenging as the browser responds
with the standard message Internal Server Error if something goes wrong. A
compulsory step is to run the CGI script on the command line to detect errors
and warnings. Form variables can be provided as command-line arguments
to Perl scripts. For example,
perl myscript.cgi formvar1='some text' var2='another answer' q=4

yields three form variables: formvar1, var2, and q.
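
Values containing spaces must be quoted on the command line, because the shell, not Perl, splits the command into arguments. As a rough illustration only (this is not the CGI module's actual code), each name=value argument can be thought of as being split at its first equals sign. The array @args below is a hypothetical stand-in for @ARGV as the shell would deliver it when such values are quoted:

```perl
use strict;
use warnings;

# Stand-in for @ARGV after the shell has handled the quotes.
my @args = ('formvar1=some text', 'var2=another answer', 'q=4');

my %form;
for my $arg (@args) {
    my ($name, $value) = split /=/, $arg, 2;   # split at the first "=" only
    $form{$name} = $value;
}

print "$_ => '$form{$_}'\n" for sort keys %form;
```

Without the quotes the shell would pass five arguments instead of three, and the words text and answer would no longer belong to any name=value pair.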


The Perl module CGI::Debug is very useful for debugging CGI scripts
when run in a browser, because it prints informative and helpful messages
about errors. The following demonstrations will indicate the basic features of
CGI::Debug.
Because many standard Perl installations do not contain the CGI::Debug
module, you may need to install it, which is most easily accomplished using
the CPAN shell, see page 53. In case you install CGI::Debug as part of your
own Perl installation, the CGI script needs to run this Perl interpreter. That
is, the header in the CGI script must contain the complete path to your Perl
interpreter, e.g. [16],
#!/some/long/path/to/my/own/bin/perl -w
use CGI::Debug;

Alternatively, you can notify any Perl interpreter about the location of the
directory where CGI::Debug is installed, e.g.,
#!/usr/local/bin/perl -w
use lib '/some/long/path/to/my/own/lib'; # CGI::Debug is here
use CGI::Debug;

[16] A wrapper script, as explained in Chapter 7.1.4 (A General Shell
Script Wrapper for CGI Scripts) in [?], is not needed unless you
have linked your Perl interpreter with special, local shared libraries.

Writing use CGI::Debug turns on debugging facilities. This will not affect a
working script. We have included CGI::Debug and introduced two errors in a
version of the hw2.pl.cgi script (the erroneous version is called hw2e.pl.cgi).
After the CGI object form is created we write some statements that trigger
warnings and errors:
print "$undefined_var\n"; # warning
print "$form->param("undefined_key")\n"; # error
# CGI scripts are normally not allowed to write files:
open(FILE, ">myfile") or die "Cannot open myfile!";

Running the script in a browser leads to a compilation error from Perl and
a corresponding report written to the browser by CGI::Debug. The report is
something like
syntax error at /some/path/hw2e.pl.cgi line 13,
near ""$form->param("undefined_key"
Execution of /some/path/hw2e.pl.cgi aborted due to compilation errors.

Your program doesn't produce ANY output!

Parameters
----------
equalsbutton = 6[equals]
r = 3[1.2]

followed by a listing of the contents of all environment variables related to
CGI programming.
Let us look at the effect of removing the wrong hash index. Running
the script again leads to a run-time error, since the CGI script is run by a
"nobody" user who is not likely to have permission to create new files for
writing. The open command will work if the owner of the directory creates a
file myfile and gives all others permission to write to it, or if the owner
gives all others write access to the current directory. CGI::Debug will in the
present example write the error message from the die statement in a new Web
page. Of course, both the undefined key and the undefined variable will be
caught by running the script from the command line. Opening a file, on the
other hand, works well when running the script directly from the command line
(the owner who executes the script has write permission in the current
directory), but not when the script is run in a browser. This latter type of
error is very common, and CGI::Debug helps you detect it easily.
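
When a script must write a file, it helps to include Perl's $! variable, which holds the system's error text, in the message so that the cause of the failure is visible in the report. A minimal sketch, where the helper name try_write and the file names are assumptions made for illustration; the failure is provoked here with a missing directory rather than with missing permissions, but the reporting works the same way:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: try to create $path; return "" on success or a
# message containing the system error text ($!) on failure.
sub try_write {
    my ($path) = @_;
    open(my $fh, '>', $path)
        or return "Cannot open $path for writing: $!";
    print $fh "some data\n";
    close($fh);
    return "";
}

my $dir  = tempdir(CLEANUP => 1);   # writable scratch directory
my $good = File::Spec->catfile($dir, "myfile");
my $bad  = File::Spec->catfile($dir, "no_such_dir", "myfile");

print "good: '", try_write($good), "'\n";   # empty message: file created
print "bad:  '", try_write($bad),  "'\n";   # open fails: directory missing
```

In a CGI script the returned message could be passed to die, so that CGI::Debug shows it in the browser.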
Finally, we can remove the open statement and the script runs correctly,
but Perl issues a warning about the undefined_var variable. This warning is
written by CGI::Debug at the end of the Web page.
1.8.3 Using Perl's CGI Module to Construct Forms


The CGI module has numerous functions for writing HTML code. Consider
the CGI script hw2.pl.cgi from page 64. Instead of writing the HTML code
as plain text in print statements, we employ some hopefully self-explanatory
functions from the CGI module:
#!/usr/local/bin/perl
use CGI;
$wp = CGI->new();
print $wp->header,
      $wp->start_html(-title=>"Hello, Web World!",
                      -BGCOLOR=>'white'),
      #$wp->start_form(-action=>'hw3.pl.cgi'), # default
      $wp->start_form,
      "Hello, World! The sine of ",
      $wp->textfield(-name=>'r', -default=>1.2, -size=>10),
      "\n", $wp->submit(-name=>'equals'), " ";
if ($wp->param()) {
    $r = $wp->param("r"); $s = sin($r);
} else { $s = sin(1.2); }
print $s, "\n", $wp->end_form,
      $wp->end_html, "\n";

The complete source code is found in src/perl/hw3.pl.cgi. This code can be
made nicer by removing the explicit appearance of the CGI object, i.e., the
$wp-> construct. The statement

use CGI qw/:standard/;

imports the most standard CGI functions directly into the namespace, so we do
not need to work with a CGI object. This enables the following more readable
version of the script:
#!/usr/local/bin/perl
use CGI qw/:standard/;
print header,
      start_html(-title=>"Hello, Web World!",
                 -BGCOLOR=>'white'),
      start_form,
      "Hello, World! The sine of ",
      textfield(-name=>'r', -default=>1.2, -size=>10),
      "\n", submit(-name=>'equals'), " ";
if (param()) { $r = param("r"); $s = sin($r); }
else { $s = sin(1.2); }
print $s, "\n", end_form, end_html, "\n";

The source code is found in src/perl/hw4.pl.cgi. More information on
features in the CGI module can be obtained from the well-written man page; just
type perldoc CGI.
Using Perl's CGI::QuickForm Module to Construct Forms. Plain output of
HTML text or using the CGI utilities for generating HTML code quickly leads
to somewhat lengthy scripts, especially when forms with many elements need
to use HTML tables for achieving satisfactory layout. The CGI::QuickForm
module takes the specification of a form and automatically generates an
HTML page, typeset in a nicely formatted way. The Web version of our
interactive Scientific Hello World program can be expressed as follows using
the CGI::QuickForm module:
#!/ifi/ganglot/k00/inf3330/www_docs/packages/SunOS/bin/perl
use CGI qw/:standard/;
use CGI::QuickForm;

show_form(
  -ACCEPT => \&on_valid_form, # must be supplied
  -TITLE => "Hello, Web World!",
  -FIELDS => [
    { -LABEL => 'Hello, World! The sine of ',
      -TYPE => 'textfield', -name => 'r',
      -default => 1.2, },
  ],
  -BUTTONS => [ {-name => 'compute'}, ], # "submit" button(s)
);

sub on_valid_form {
  my $r = param('r');
  my $s = sin($r);
  print header, $s; # write new page with the answer
}

You can find the source in src/perl/hw5.pl.cgi. Unfortunately, few standard
Perl installations contain the CGI::QuickForm module. We therefore
have to hardcode the path to our own Perl interpreter, which knows about
CGI::QuickForm (if you have followed the instructions in Chapter 1.5), or
we can apply a use lib statement to notify the Perl interpreter where our
CGI::QuickForm module is installed. Since CGI scripts are run by a "nobody"
user and not yourself, the script will not be aware of your special environment
variables. You therefore need to hardcode the complete path in a use lib
statement. If you need to make use of environment variables that are under
your control, you can wrap a Bourne shell script around your CGI script;
details are provided in Chapter 7.1.4 (A General Shell Script Wrapper for CGI
Scripts) in [?]. When you let a CGI script use your own Perl,
make sure that the Perl interpreter and its libraries are readable for all users.
The CGI::QuickForm man page offers a good introduction to the many
features of this module. Although the use of CGI::QuickForm in our Scientific
Hello World example is overkill, the advantages of the module become
more apparent when the form contains several form elements and when
validation of user input is desired.
One example showing that CGI::QuickForm is handy is the
src/perl/simviz1.pl.cgi
script, which is a counterpart to the CGI script cg/src/python/simviz1.py.cgi
from Chapter 7.2 (Adding Web Interfaces to Scripts) in [?]. The purpose
of this script is to create a Web interface to the oscillator code from
Chapter 2.3 (Gluing Stand-Alone Applications) in [?]. The user can
type in values of the mathematical parameters and get a plot of the solution.
2.5 GUI Programming with Tcl/Tk

pack $hw.text -side top -pady 20

# second frame consists of the sine computations:
set sine [ frame $top.sine ]
pack $sine -side top -pady 20 -padx 10
label $sine.intro -text "The sine of "
set r 1.2; # default
entry $sine.r -width 6 -relief sunken -textvariable r
button $sine.eq -text " equals" -command comp_s -relief flat
label $sine.s -textvariable s -width 14
pack $sine.intro $sine.r $sine.eq $sine.s -side left

# third frame consists of a quit button (we drop the
# frame and add the button straight into the toplevel widget)
button $top.quit -text "Goodbye, GUI World!" -command exit \
    -background yellow -foreground blue
pack $top.quit -side top -pady 5 -fill x

bind $sine.r <Return> comp_s

proc comp_s { } { global r; global s; set s [ expr sin($r) ] }
bind . <q> exit

2.5.6 Configuring Widgets


In Chapter 6.1.6 (An Alternative to Tkinter Variables) in [?] we
also develop a version of the Scientific Hello World GUI where we do not tie
variables to widgets but instead extract their content with a get function and
set their values either with set or configure. The hwGUI7_novar.tcl script
is a version of hwGUI7.tcl employing this strategy. After having created an
entry,
label .sine.intro -text "The sine of "
entry .sine.r -width 6 -relief sunken

we insert a default value:


.sine.r insert end 1.2; # default value

The default value can, of course, also be inserted at construction time of the
entry.
When we need to read the contents of the entry (in the comp_s routine)
we simply use the get command:
set r [ .sine.r get ]
set s [ expr sin($r) ]

Updating the contents of the result label is done with configure:


.sine.s configure -text $s

where .sine.s is the name of a label widget. This means that the comp_s
function takes the form
proc comp_s { } {
global .sine.r .sine.s; # access global widgets
set r [ .sine.r get ]
set s [ expr sin($r) ]
.sine.s configure -text $s
}

2.5.7 The Grid Geometry Manager


The grid geometry manager works as explained in Chapter 6.1.8 (An Introduction
to the Grid Geometry Manager) in [?], and a Tcl/Tk
version of the Scientific Hello World GUI is found in the file hwGUI9_grid.tcl.
With a background from these examples and some experience with the
Tcl syntax, the reader should have a good starting point for being productive
with Tcl/Tk programming, or at least, the reader should be able to look up
Tk documentation written with Tcl syntax (for example, the original Tk man
pages). The Tk functions and options are almost identical regardless of the
language we use. Unfortunately, some small differences exist, and these can
be annoying. If a particular Tk construction triggers an error when expressed
in another language, you should try to find man pages for the Tk bindings
in the language in question and check if the option has a different name or
different legal values.

Exercise 2.1. Tcl/Tk version of the GUI in Chapter 6.2 (Adding GUIs to Scripts)
in [?].
Write the simviz1.py script from Chapter 6.2 (Adding GUIs to Scripts)
in [?] in Tcl/Tk. You can use the Scientific Hello World GUIs as starting
point, but you need to consult the Tk man pages to find the proper syntax
for a slider and a picture in Tcl/Tk.
Hint: PhotoImage in the Python script is a counterpart to the image
procedure in Tk, being an example of different names in the Python and Tcl
interfaces to Tk. The construction of the image is also different.

A List of Common Widget Operations. Chapter 6.3 (A List of Common Widget
Operations) in [?] presents a demo script containing many of the
most important widgets in Tk and Tk extensions. A simpler type of demo
script has been written in Tcl/Tk and [incr Widgets], see demoGUI.tcl in
src/tcl. The demoGUI.tcl script serves as a rough overview of common GUI
constructions in Tcl/Tk and [incr Widgets].
Bibliography
