
Rexx for everyone

Scripting with Free Software Rexx implementations


David Mertz
Developer
Gnosis Software, Inc.

04 February 2004

It's easy to get lost in the world of "little languages" -- quite a few have been written to scratch
some itch of a company, individual, or project. Rexx is one of these languages, with a long
history of use on IBM operating systems, and good current implementations for Linux and
other Free Software operating systems. Rexx occupies a useful ecological niche between the
relative crudeness of shell scripting and the cumbersome formality of full systems languages.
Many Linux programmers and systems administrators would benefit from adding a Rexx
implementation to their collection of go-to tools.

About Rexx
The Rexx programming language was first created in 1979, as a very high level scripting language
that had a particularly strong facility for text processing tasks. Since Rexx's inception, IBM has
included versions of Rexx with most of its operating systems -- all the way from its mainframes, to
its mid-level systems, to end user OS's like OS/2 and PC-DOS. Other OS makers, such as Amiga,
have also integrated Rexx as an always-available system scripting language. A number of ISVs,
moreover, have created Rexx environments for many platforms. Somewhat late in the game, ANSI
officially adopted a standard for Rexx in 1996.
Nowadays (especially on Linux or BSD-derived OS's), most of those older implementations
of Rexx are primarily interesting as historical footnotes. However, two currently maintained
implementations of Rexx remain available across a wide range of platforms, including Linux,
Mac OS X and Windows: Regina and NetRexx. Regina is a native executable, available as Free
Software source code, or pre-compiled for a large number of platforms -- install it pretty much as
you would any other programming language interpreter. NetRexx, an IBM project, is an interesting
hybrid: the language is a derivative of plain Rexx and, much like Jython or Jacl, NetRexx compiles
Rexx-like source code into Java bytecodes, and (optionally) runs the resulting .class file within a JVM.
In capabilities and programming level, Rexx can be compared most closely to bash plus the GNU text utilities
(throwing in grep and sed for good measure); or maybe to awk or Perl. Certainly Rexx has more
of a quick-and-dirty feel to it than do, e.g., Python, Ruby, or Java. The verbosity -- or rather,
conciseness -- of Rexx is similar to that of Perl, Python, Ruby or TCL. And Rexx is certainly
Turing-complete, enables modules and structured programming, and has libraries for tasks such as
GUI interfaces, network programming, and database access. But its most natural target is in
automation of system scripting and text processing tasks. As with shell scripting, Rexx allows
very natural and transparent control of applications; but compared to bash (or tcsh, ksh, etc.), Rexx
contains a much richer collection of built-in control structures and (text processing) functions.
Stylistically, the IBM/mainframe roots of Rexx show in its case-insensitive commands; and to a
lesser degree in the relative sparsity of punctuation it uses (preferring keywords to symbols). I tend
to find that these qualities aid readability; but this is mostly a matter of individual taste.

A start at streams and stacks


As a simple conceit, let me present a number of versions of a very simple utility that lists files
and numbers them. One feature that Rexx has in common with shell scripting is that it has a
relatively impoverished collection of functions for working with the underlying operating system --
mostly limited to the ability to open, read, and modify files. For most anything else, you rely on
external utilities to perform the job at hand. The utility numbered-1.rexx simply processes STDIN:

Listing 1. numbered-1.rexx
#!/usr/bin/rexx
DO i=1 UNTIL lines()==0
  PARSE LINEIN line
  IF line\="" THEN
    SAY i || ") " || line
END

The ubiquitous instruction PARSE can read from various sources. In this case, it puts the next line of
STDIN into the variable line. We also check if a line is blank, and skip showing and numbering it if
so. For example, combined with ls we can get:

Listing 2. Piping command to numbered-1


$ ls | ./numbered-1.rexx
1) ls-1.rexx
2) ls-2.rexx
3) ls-3.rexx
4) ls-4.rexx
5) ls-5.rexx
6) ls-6.rexx
7) numbered-1.rexx
8) numbered-2.rexx

You can just as easily pipe any other command in.
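
One subtlety of numbered-1.rexx is that the loop counter i advances on every line read, so a skipped blank line still uses up a number. If you would rather number only the lines actually shown, a minimal sketch might look like the following. The counter name n is purely illustrative, and, like the listing above, the sketch assumes that lines() reports pending input on STDIN the way Regina does:

```rexx
#!/usr/bin/rexx
/* Sketch: number only the non-blank lines of STDIN */
n = 0                          /* counts lines actually displayed */
DO WHILE lines() > 0           /* stop once STDIN has no more lines */
  PARSE LINEIN line
  IF line \= "" THEN DO
    n = n + 1
    SAY n || ") " || line
  END
END
```

Keeping the counter separate from the loop control costs a few extra lines, but the numbering then stays consecutive whatever the input looks like.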


A concept at the core of Rexx is juggling multiple stacks or streams. In bash-like fashion, anything
in Rexx that is not recognized as an internal instruction or function is assumed to be an external
utility. There is no special function or syntax for calling an external command. Taking advantage
of the Regina utility rxqueue, which puts output onto the Rexx stack, we can write a "numbered ls"
utility as:

Listing 3. ls-1.rexx
#!/usr/bin/rexx
"ls | rxqueue"
DO i=1 WHILE queued() \= 0
  PARSE PULL line
  SAY i || ") " || line
END

Some instructions in Rexx may explicitly specify a stack to operate on; but other instructions
operate within an environment which you configure with the ADDRESS instruction. STDIN, STDOUT,
STDERR, files, and in-memory data stacks are all handled in a uniform and elegant fashion.
Above we used the external rxqueue utility, but we can similarly redirect output of utilities right
within Rexx. For example:

Listing 4. ls-2.rexx
#!/usr/bin/rexx
ADDRESS SYSTEM ls WITH OUTPUT FIFO '' ERROR NORMAL
DO i=1 WHILE queued() \= 0
PARSE PULL line; SAY i || ") " || line; END

It might appear that the ADDRESS command is grabbing the output of just the ls utility; but it is
actually changing the general execution environment for later external calls. These examples were
run on a case-retentive/case-insensitive filesystem; under many systems, you will have to quote
"ls" to maintain its case. This behaves identically:

Listing 5. ls-5.rexx
#!/usr/bin/rexx
ADDRESS SYSTEM WITH OUTPUT FIFO '' ERROR NORMAL
ls
DO i=1 WHILE queued()\=0; PARSE PULL ln; SAY i||") "||ln; END

Any subsequent external commands, if they are run in the default SYSTEM environment, will direct
their output to the default FIFO (first-in-first-out). You could also output to a LIFO instead (either
named or default) -- the difference being that a FIFO adds to the "bottom" of the stack, and a LIFO
to the "top." The instructions PUSH and QUEUE correspond to LIFO and FIFO operations on the
stack. The instructions PULL and PARSE PULL take a string off the top of the stack.
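
To make the difference between the two concrete, here is a small sketch that mixes QUEUE and PUSH on the default stack and then drains it. The strings are arbitrary illustrations, and the stack is assumed to be empty when the script starts:

```rexx
#!/usr/bin/rexx
/* Sketch: QUEUE adds at the bottom of the stack, PUSH adds at the top */
QUEUE "first queued"
QUEUE "second queued"
PUSH  "pushed last"
DO WHILE queued() > 0
  PARSE PULL item              /* always takes from the top */
  SAY item
END
/* --> pushed last   */
/* --> first queued  */
/* --> second queued */
```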
Another useful stack to look at is that of the command-line arguments to a Rexx script. For
example, we might want to execute an arbitrary command in our numbering utility, not always ls:

Listing 6. numbered-2.rexx
#!/usr/bin/rexx
PARSE ARG cmd
ADDRESS SYSTEM cmd WITH OUTPUT FIFO ''
DO i=1 WHILE queued()\=0; PARSE PULL ln; SAY i||") "||ln; END

The script in action:

Listing 7. Passing command to numbered-2


$ ./numbered-2.rexx ps -a -x
1)   PID  TT  STAT      TIME COMMAND
2)     1  ??  Ss     0:00.00 /sbin/init
3)     2  ??  Ss     0:00.19 /sbin/mach_init
4)    51  ??  Ss     0:01.95 kextd
[...]

PARSE PULL can be used to pull lines from user input. Following the example of the execution of the
argument cmd, you could write a shell or interactive environment in Rexx (perhaps running either
external utilities or built-in commands, much like bash).
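
As a rough illustration of that idea, the sketch below reads commands from the terminal and hands each one to the SYSTEM environment. The prompt text and the "quit" keyword are hypothetical choices rather than anything Rexx defines, and how PARSE PULL behaves at end-of-file varies between interpreters, so treat this as a starting point only:

```rexx
#!/usr/bin/rexx
/* Sketch of a minimal command loop; "quit" is a made-up built-in */
DO FOREVER
  CALL charout , "rexx> "      /* prompt without a trailing newline */
  PARSE PULL cmd               /* stack is empty, so this reads user input */
  IF cmd = "" | translate(cmd) = "QUIT" THEN LEAVE
  ADDRESS SYSTEM cmd           /* anything else goes to the system shell */
END
```

An empty line or the word quit (in any case) ends the loop; everything else runs as an external command.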

Stem variables and associative arrays


In Rexx -- somewhat like in TCL -- to a large extent everything is a string. Having stacks and
streams composed of lines gives you a simple list or array of strings. But mostly, strings simply
act like other datatypes as needed. For example, a string that contains a suitable representation
of a number (digits, decimal, an exponent "e", etc.) can be used in arithmetic operations. For
processing reports, log files, and the like, this is exactly the behavior you want.
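
A short sketch shows strings doing double duty as numbers. The variable names and values are illustrative only, and the built-in datatype() function reports whether Rexx considers a string numeric:

```rexx
#!/usr/bin/rexx
/* Sketch: strings that look like numbers behave like numbers */
count = "12"
size  = " 3.5 "                /* surrounding blanks are allowed in a number */
SAY count + size               /* --> 15.5 */
SAY "1e3" + 1                  /* --> 1001 */
SAY datatype(count)            /* --> NUM  */
SAY datatype("12 apples")      /* --> CHAR */
```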
Rexx, however, does have one additional standard datatype: associative arrays. In Rexx they go
under the name "stem variables," but the concept is very similar to that of dictionaries in many other
languages. The syntax for stem variables will be oddly familiar to users of OOP languages like
Java, Python, or Ruby: a dot separates "objects" and their "attributes." This is not really
object-orientation, but the syntax does (accidentally) highlight the degree to which an object resembles a
particularly robust dictionary; there are OOP extensions to Rexx out there, but this article will not
address them.
Not every string is a valid Rexx symbol -- which restricts the keys in the dictionary -- but Rexx is
pretty liberal about its symbol names, compared to most languages. E.g.:

Listing 8. Using stem variables in Rexx


$ cat stems.rexx
#!/usr/bin/rexx
foo.X_!1.bar = 1
foo.X_!1.23 = 2
foo.fop.fip = 3
foo.fop = 4
SAY foo.X_!1.bar # foo.X_!1.23 # foo.fop.fip # foo.fop # foo.fop.NOPE
$ ./stems.rexx
1 # 2 # 3 # 4 # FOO.FOP.NOPE

A couple of features stand out in this example. We set a value for both a stem and its compound (e.g.
foo.fop and foo.fop.fip). Also notice that the undefined symbol foo.fop.nope simply stands for
its own spelling, absent an assignment to the contrary. This lets us skip quotes in most situations.
Case of names is normalized to upper case in most Rexx contexts.
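
The upper-casing has one consequence worth seeing in code. A tail written literally is folded to upper case, but a tail that names a variable is replaced by that variable's value exactly as stored. In the sketch below (key is an illustrative variable name), the two assignments therefore land in two different compound variables:

```rexx
#!/usr/bin/rexx
/* Sketch: a literal tail versus a tail taken from a variable */
key = "fop"
foo.key = "set through the variable key"   /* tail is the value "fop" */
foo.fop = "set through the literal tail"   /* tail is the name "FOP"  */
SAY foo.key      /* --> set through the variable key */
SAY foo.fop      /* --> set through the literal tail */
```

Using a variable as the tail is what turns a stem into a general dictionary: the key can then be any string, not just a valid symbol.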
One useful trick is to set a value for the dotted stem, which then acts as a default value for
compound names based on the stem. For the next example, we also make use of the capability to
ADDRESS the sequentially numbered symbols of a compound name as an output environment:

Listing 9. ls-3.rexx
#!/usr/bin/rexx
ls. = UNDEF
ADDRESS SYSTEM ls WITH OUTPUT STEM ls.
DO i=1
  IF ls.i == UNDEF THEN LEAVE
  SAY i || ") " || ls.i
END

As soon as the loop gets to some compound variable name that was not populated by the output
of the external ls utility, we detect the default value of "UNDEF" and leave the loop (if the output
might contain that string, a false collision could occur, however).
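
One way to sidestep the collision is to rely on the line count instead of a sentinel. The sketch below assumes that the interpreter stores the number of captured lines in the 0 element of the stem, as ANSI-style ADDRESS ... WITH redirection is documented to do in Regina. It also quotes "ls" so the command keeps its case:

```rexx
#!/usr/bin/rexx
/* Sketch: loop over captured output using the count in ls.0 */
/* Assumption: the interpreter sets ls.0 to the number of lines captured */
ADDRESS SYSTEM "ls" WITH OUTPUT STEM ls.
DO i = 1 TO ls.0
  SAY i || ") " || ls.i
END
```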
Rexx also has an error-handling system that lets you SIGNAL conditions and handle them
appropriately. Instead of checking for a default compound value, you can also catch the access to
an undefined variable. E.g.:

Listing 10. ls-6.rexx


#!/usr/bin/rexx
ADDRESS SYSTEM ls WITH OUTPUT STEM ls.
SIGNAL ON NOVALUE NAME quit
DO i=1
  SAY i || ") " || ls.i
END
quit:

Just to round our ls variants out, here is one more that uses a file for its I/O:

Listing 11. ls-4.rexx


#!/usr/bin/rexx
ADDRESS SYSTEM ls WITH OUTPUT STREAM files
DO i=1
  line = linein(files)
  IF line = "" THEN LEAVE
  SAY i || ") " || line
END
rm files

Since the output stream is a regular file, it is probably good to remove it at the end.

Text processing functions


The brief examples above will give readers a bit of the feel of Rexx as a programming language.
You can also, of course, define your own procedures and functions -- in separate module files, if
you wish -- and call them either with the CALL instruction or using parenthesized arguments, as
with some of the examples in this article that use standard functions.
Perhaps the greatest strength in Rexx as a text processing language is its useful collection of built-in
string manipulation functions. Somewhere over half of all the standard Rexx functions are for
working with strings, with a chunk of others thrown in for quite readable manipulation of bit vectors.
Moreover, even bit vectors are often manipulated (or read in) as strings of ones and zeros:

Listing 12. bits.rexx


#!/usr/bin/rexx
SAY b2c('01100001') b2c('01100010')          /* --> a b */
SAY bitor(b2c('01100001'), b2c('01100010'))  /* --> c   */
SAY bitor('a','b')                           /* --> c   */
EXIT
/* Function in ARexx, but not ANSI Rexx */
b2c: PROCEDURE
  ARG bits
  return x2c(b2x(bits))

A nice feature of Rexx's text handling functions is the naturalness of treating lines as
being composed of whitespace-separated words. For textual reports and log files, easily
ignoring extraneous whitespace is quite useful -- 'awk' does something similar, but Python's
string.split() quickly gets more "busy" in describing the same operations. In fact, "arrays" in
Rexx just amount to whitespace-separated strings. The PULL instruction will pull out variables from
a general template pattern for a line, which in the minimal case allows word division:

Listing 13. pushpull.rexx


#!/usr/bin/rexx
PUSH "a b c d e f"
PULL x y " C " z   /* pull x and y before the C, remainder into z */
SAY x # y # z      /* --> A # B # D E F */
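
Templates are not limited to the stack. As a sketch of the same idea applied to a log-style line, PARSE VAR splits the contents of a variable and, unlike plain PULL, leaves the case alone. The log entry and all variable names here are made up for illustration:

```rexx
#!/usr/bin/rexx
/* Sketch: template parsing of a made-up log entry held in a variable */
entry = "2004-02-04 12:30:01 ERROR disk full on /dev/hda1"
PARSE VAR entry stamp clock level message    /* split into words */
SAY level                                    /* --> ERROR */
SAY message                                  /* --> disk full on /dev/hda1 */
PARSE VAR stamp year "-" month "-" day       /* split on a literal pattern */
SAY day || "/" || month || "/" || year       /* --> 04/02/2004 */
```

The last variable in a template receives whatever is left over, which is why message holds the whole tail of the line.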

Further dividing strings that may or may not have been pulled with a template is elegant. Functions
like wordpos(), word(), wordindex(), words(), and subword() let you refer to the "words" in a string
as if they made up a list, e.g.:

Listing 14. Working with words in a string


seuss = "The cat in the hat came back"
thehat = wordpos('the hat', seuss)
SAY "'came'" is wordlength(seuss, thehat+2) letters long
/* --> 'came' IS 4 LETTERS LONG */
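
The other word functions follow the same pattern. A brief sketch, reusing the same sentence, with the results the standard definitions of these functions should give:

```rexx
#!/usr/bin/rexx
/* Sketch: treating a string as a list of words */
seuss = "The cat in the hat came back"
SAY words(seuss)               /* --> 7       (number of words)               */
SAY word(seuss, 2)             /* --> cat     (the second word)               */
SAY subword(seuss, 4, 2)       /* --> the hat (two words starting at word 4)  */
SAY wordindex(seuss, 6)        /* --> 20      (character position of word 6)  */
```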

Of course, you also get a rich collection of character-oriented functions. It is equally easy to
work with character positions using functions like reverse(), right(), justify(), center(), pos(),
or substr() (and others).
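
A few of these in action, as a quick sketch with arbitrary sample strings:

```rexx
#!/usr/bin/rexx
/* Sketch: some character-oriented built-ins */
SAY reverse("Rexx")                      /* --> xxeR         */
SAY right(42, 6, "0")                    /* --> 000042       */
SAY "[" || center("Rexx", 10) || "]"     /* --> [   Rexx   ] */
SAY pos("hat", "The cat in the hat")     /* --> 16           */
SAY substr("The cat in the hat", 5, 3)   /* --> cat          */
```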
Another batch of the built-in functions let you work with dates and numbers in a flexible,
report-oriented way. That is, numbers can be read and written in a variety of formats, with arbitrary
(configurable) precision. Dates, similarly, can be read, written and converted among many formats
with standard function calls (e.g. day-of-week, days-in-century, European versus US dates, etc.).
The flexibility with dates and numbers is probably less often necessary in writing system scripts
and processing log files than it is in working with semi-structured output reports from database
applications. But when you need it, it is much more robust to have well-tested built-in functions
than to write your own ad-hoc converters and formatters.
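
As a sketch of what that looks like in practice, the example below raises the arithmetic precision and converts one date between formats. The three-argument form of date() used here for conversion is assumed to be available, as it is in ANSI-level interpreters such as Regina; older Rexx implementations may not accept it:

```rexx
#!/usr/bin/rexx
/* Sketch: configurable precision and date conversion */
NUMERIC DIGITS 20                 /* the default precision is 9 digits */
SAY 2 ** 64                       /* --> 18446744073709551616 */
SAY format(3.14159, 4, 2)         /* --> 3.14, padded to 4 integer places */
/* Assumption: date(output, input, input-format) is supported */
SAY date('W', '20040204', 'S')    /* --> Wednesday */
SAY date('U', '20040204', 'S')    /* --> 02/04/04  (US order)       */
SAY date('E', '20040204', 'S')    /* --> 04/02/04  (European order) */
```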

Wrap up
Coming more out of an IBM "big-iron" environment than from Unix systems, Rexx is little-known
to many Linux programmers and systems administrators. But there remains an important Linux
niche where Rexx is a better scripting solution than either the "too-light" bash or ksh shells, or the
"too-heavy" interpreted programming languages like Python, Perl, Ruby, TCL, or maybe Scheme.
For quick and easily readable scripts that perform text manipulation on the inputs and outputs of
external processes, Rexx is hard to beat, and not hard to learn or install.

Resources
Regina is the LGPL Rexx implementation used in writing and testing the examples in this
article, and for most people will be the best choice for an environment to install. Regina
is available on an extremely wide range of platforms, and is fully ANSI compliant (with a
few extensions added). The site also contains links to a number of useful Rexx libraries for
working with application areas like Tk, Curses, Sockets, SQL, and others.
NetRexx is an IBM project for compiling Rexx code for a Java Virtual Machine. While it is
certainly possible to use NetRexx for client/workstation scripting needs, the main focus of
NetRexx is to allow more rapid development of server-side Java applications (JSP and
related technologies).
The Rexx Language Association is a general advocacy group for the Rexx programming
language. Their site contains miscellaneous links to useful libraries and other resources,
including details on the Rexx ANSI Standard.
IBM Hursley also maintains a nice list of Rexx resources. You can find tutorials, references,
and links to a variety of modern libraries. The Hursley site also has a page on the history of
Rexx.
The IBM REXX Family runs on host systems, such as VM/ESA, VSE/ESA, and MVS/ESA,
and workstation environments, such as AIX, OS/2, Linux, and Windows.
There is even a Mod_Rexx package for the Apache Web server.
Find more resources for Linux developers in the developerWorks Linux zone.

About the author


David Mertz
David Mertz is owner and chief consultant for Gnosis Software, Inc. whose
corporate slogan is "We Know Stuff!" (and we do). His fondness for IBM dates back
embarrassingly many decades. You can reach David at mertz@gnosis.cx; you
can investigate all aspects of his life at his personal Web page. Suggestions and
recommendations on past or future articles are welcome. Check out his book, Text
Processing in Python.
Copyright IBM Corporation 2004
(www.ibm.com/legal/copytrade.shtml)
Trademarks
(www.ibm.com/developerworks/ibm/trademarks/)
