
ParallelKnoppix Tutorial

Michael Creel 15th December 2004


Abstract This note shows how to set up a Linux cluster for MPI parallel processing using ParallelKnoppix, a bootable CD.

1 Introduction
ParallelKnoppix is a bootable CD that allows creation of a Linux cluster in very little time. The cluster is temporary: once it is shut down, the computers are in their original state. There are no OS or software requirements. This tutorial shows how to create a cluster, step by step, using screenshots. For more general information on ParallelKnoppix, please see the above link, or this document. If you happen to be reading the HTML version of this tutorial and you find it to be useful, please get the PDF version and print it. It will be a lot easier on your eyes. The PDF version is here.

2 A sample session
This section presents a series of screenshots that illustrate the setup of the cluster and the parallel execution of a simple program. The next section gives a brief example of a more useful application.

2.1 Booting up
Is your entire hard disk formatted with NTFS partition(s)? This filesystem is proprietary and there is not enough public information about how to write to it reliably when not running Windows. Instead of using a complicated work-around, I suggest switching now to a computer that has a FAT32 (or EXT2, EXT3, ReiserFS, etc.) partition with enough free space for your working files. Or, repartition your hard disk to add one such partition. The QTParted program on the CD can help you with this, but it can also help you erase your data if you don't know what you're doing, so be careful.

Supposing the above is not a problem, place the CD in one of your computers, and boot up. You might like to press F2 and/or F3 to see some options that you can use. Options you should not use are knoppix26, and any language other than English. Sorry about that last one. Upon booting the master computer we see the screen

Maybe this should be symmetricknoppix? Oh, well...

2.2 Configuration
All configuration is done using a script that you can start from the ParallelKnoppix/SetupParallelKnoppix menu item, as follows:

In what follows, some of what you will see while the script executes is skipped when you have no options to affect the way the script runs. Just take it easy and compare what you see on your screen with the shots here. The first thing to do is to choose which network card connects to the cluster (this is an easy one if you only have one card, which is the usual case).

This card has been assigned the IP address 192.168.0.1. You can ssh into the master node later, either as knoppix or as root, with the password parallelknoppix. This is useful for getting your work onto the cluster, or getting your results off.

Next, the terminal server is started to boot your slave computers. ParallelKnoppix uses a modified version of the terminal server script from ClusterKnoppix (Vandermissen, 2004), since it allows the nodes to boot in text mode. That way they'll have more memory for useful work. Do this as follows.

Click on OK. Next, select the network card for the cluster.

Choose the range of IP addresses for the slaves. They should go from 192.168.0.2 up to 192.168.0.X, where X is the total number of computers in the cluster (see footnote 1). In this example, I'm using only one slave.
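The address rule above can be illustrated with a few lines of Python. This is only a sketch for checking your arithmetic, not something the setup script runs; the function name `cluster_addresses` is made up for this example, and it assumes the master keeps 192.168.0.1 as described earlier.

```python
import ipaddress


def cluster_addresses(total_computers, master="192.168.0.1"):
    """Return (master, slaves) addresses for a cluster of the given size.

    The master keeps its address and the slaves take the consecutive
    addresses that follow it, so the last slave ends in X, where X is
    the total number of computers in the cluster.
    """
    if total_computers < 2:
        raise ValueError("need the master plus at least one slave")
    first = ipaddress.IPv4Address(master)
    # Adding an integer to an IPv4Address yields the next addresses.
    slaves = [str(first + offset) for offset in range(1, total_computers)]
    return str(first), slaves


if __name__ == "__main__":
    master, slaves = cluster_addresses(4)
    print("master:", master)
    print("slaves:", slaves[0], "to", slaves[-1])
```

With one slave, as in this tutorial, the total is 2 and the range is just 192.168.0.2.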

In this next screen, accept the default - export the CDROM.

Footnote 1: The CD supports X ≤ 50, but if anyone needs more slaves, I'm willing to make a new version that allows it.

You need to make sure that ALL network card types in the cluster are rolled into the PXE boot configuration. I need to add via-rhine in my example.

For ordinary clustering, only the textmode option is needed. Be sure not to select secure (see footnote 2).

Footnote 2: Hopefully a little warning bell went off in your head just now. This is NOT a secure setup. Anyone who can get into the cluster can do anything they want to any computer in the cluster, and up above I told you the root password. The cluster should not be connected to other networks, and only trusted users should be allowed to work with it. Or, have a good set of backups.

You may need some special options to get your nodes to boot. I just click on the default myself:

Now you should turn on the slave nodes and let them boot using PXE. Have a cup of coffee, or whatever, while they boot.

Now, you need to select a partition on the master node's hard disk on which to create a working directory. The working directory will be named parallel_knoppix_working to minimize the chances that an existing directory has the same name. The script will not let you use NTFS partitions, so if that's all you have, you will encounter a problem here. Please go to the top of this document and read more carefully if you have this problem. Otherwise, select a partition, and don't forget which one you choose; we'll need to know that in a moment.

Now we encounter this message. You need to make sure the slaves are booted before continuing, otherwise they will not have the shared working directory on them, and things won't work later.

To check if a slave is booted, you can open up a Konsole and attempt to ssh into it. When you can, it's booted. When all slaves are booted, click OK on that previous dialog box.
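If you have more than a couple of slaves, trying each one by hand gets tedious. Here is a rough Python sketch that tries to open a TCP connection to the ssh port on each slave. It is not part of the CD's script, and the names `ssh_port_open` and `slaves_not_ready` are invented for this example. An open port 22 only suggests that sshd is up; an actual ssh login is still the real test.

```python
import socket


def ssh_port_open(host, port=22, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat as not booted yet.
        return False


def slaves_not_ready(hosts, check=ssh_port_open):
    """Return the hosts that do not yet accept connections."""
    return [host for host in hosts if not check(host)]


if __name__ == "__main__":
    slaves = ["192.168.0.2"]  # the single slave used in this tutorial
    waiting = slaves_not_ready(slaves)
    if waiting:
        print("still booting:", ", ".join(waiting))
    else:
        print("all slaves answer on the ssh port")
```

The `check` argument is there so that the list logic can be tried out without a network.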

Now, remember which partition has your working directory on it, and click appropriately:

To export the working directory, we need to know how many nodes there are.

Now we fire up LAM. Enter (again) the total number of nodes in the cluster (see footnote 3).


If you're here, congratulations, the cluster is working. Note the list of nodes in the background terminal window; that's proof that all went well.

Footnote 3: Note to self: combine these last two in the script.

2.3 Using it
Copy something to work on into the working directory. I'll use the MPITB Monte Carlo example here. For you non-KDE-heads out there, here's a drag-n-drop copy from ./Examples to the working directory:

Open up a terminal in the ./parallel_knoppix_working/mpi_work/montecarlo directory, and type octave to start GNU Octave. In octave, type tracetest_MC to do a Monte Carlo study of the tracetest.m function.
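For readers who have not seen a parallel Monte Carlo study before, the basic idea is to divide the replications among the nodes and pool the results. The Python sketch below illustrates only that idea. It is not the MPITB code on the CD, it runs the chunks one after another in a single process, and the quantity it simulates (the mean of 30 standard normal draws) is a stand-in chosen for this example, not what tracetest.m computes.

```python
import random
import statistics


def split_replications(total, nodes):
    """Divide `total` replications as evenly as possible among `nodes`."""
    base, extra = divmod(total, nodes)
    return [base + (1 if rank < extra else 0) for rank in range(nodes)]


def run_chunk(replications, sample_size, seed):
    """One node's share: sample means of standard normal draws."""
    rng = random.Random(seed)  # a separate seed per node
    return [
        statistics.fmean([rng.gauss(0.0, 1.0) for _ in range(sample_size)])
        for _ in range(replications)
    ]


def monte_carlo(total, nodes, sample_size=30):
    """Run every chunk (serially here) and pool the results."""
    pooled = []
    for rank, reps in enumerate(split_replications(total, nodes)):
        pooled.extend(run_chunk(reps, sample_size, seed=1000 + rank))
    return pooled


if __name__ == "__main__":
    results = monte_carlo(total=4000, nodes=2)
    # The variance of the sample mean should be close to 1/30.
    print(len(results), statistics.variance(results))
```

On a real cluster each chunk would run on its own node and the master would collect the pieces, which is the part that MPI takes care of.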

O happy day, the output of a program run in parallel!

3 Extensions
To use ParallelKnoppix to execute programs that are not on the CD, they just need to be copied into the directory ~/Desktop/parallel_knoppix_working. This directory is by default empty, but it is possible to mount an existing hard drive partition there, or files may be copied in across the network or from a USB storage device, for example. Advanced users can also use NFS exports from computers that are not in the cluster. Hint: the passwords for the root user and the knoppix user are both parallelknoppix. With that you can use scp, ssh, fish, etc.

If the CD does not contain needed libraries or applications, the CD itself can be modified to create a personalized version. Documentation that explains how this may be done, and scripts that largely automate the process, are included in the Remastering subdirectory inside the ParallelKnoppix directory on the desktop. Since ParallelKnoppix is based upon Debian Linux, installation of packages is very simple using the apt-get system, and a very large amount of pre-compiled software is available. If you have a nice self-contained example, consider sending it to me for inclusion on the CD in the examples directory.

It is worth emphasizing again that ParallelKnoppix gives the user complete control over all of the nodes of the cluster. A user can easily delete or modify data on any hard disk partition of any of the nodes. As such, administrators should not let untrusted users work with it. It would also be advisable to have disk images or some other backup of all nodes available, in case a disastrous mistake is made. ParallelKnoppix provides a very easy means of creating a cluster. The ease of setup is obtained largely at the expense of security.
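As an illustration of the scp hint in this section, here is a small Python sketch that builds the command for copying a local directory into the working directory on the master. It assumes that you are on a machine that can reach the master at 192.168.0.1 and that you log in as the knoppix user; the names `scp_command` and `copy_to_cluster` and the example directory `my_project` are made up. By default it only prints the command, so nothing is copied until you ask for it.

```python
import shlex
import subprocess

MASTER = "192.168.0.1"  # master address given in this tutorial
WORKDIR = "~/Desktop/parallel_knoppix_working"  # working directory


def scp_command(local_path, user="knoppix", host=MASTER, remote_dir=WORKDIR):
    """Build the scp argument list for a recursive copy to the master."""
    return ["scp", "-r", local_path, f"{user}@{host}:{remote_dir}/"]


def copy_to_cluster(local_path, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    cmd = scp_command(local_path)
    print(shlex.join(cmd))
    if not dry_run:
        # scp will prompt for the password on the terminal.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    copy_to_cluster("my_project")  # hypothetical local directory
```

Passing the arguments as a list avoids shell quoting problems with odd file names.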

4 Conclusion
The ParallelKnoppix CD provides a very simple and rapid means of setting up a cluster of heterogeneous PCs of the IA-32 architecture. It is not intended to provide a stable cluster for multiple users; rather, it is a tool for rapid creation of a cluster. The CD itself is personalizable, and the configuration and working files can be re-used over time, so it can provide a long term solution for an individual user. I welcome comments, suggestions and contributions from anyone who uses this.

References
[1] Creel, Michael (2004), ParallelKnoppix - Create a Linux Cluster for MPI Parallel Processing in 15 Minutes, http://pareto.uab.es/mcreel/ParallelKnoppix/.
[2] Knopper, Klaus (undated), KNOPPIX - Live Linux Filesystem on CD, http://www.knopper.net/knoppix/index-en.html.
[3] LAM team (2004), LAM/MPI Parallel Computing, http://www.lam-mpi.org/.
[4] Message Passing Interface Forum (1997), MPI-2: Extensions to the Message-Passing Interface, University of Tennessee, Knoxville, Tennessee.
[5] Gropp, W., E. Lusk, N. Doss and A. Skjellum (1996), "A high-performance, portable implementation of the MPI message passing interface standard", Parallel Computing, 22, 789-828, see also http://www-unix.mcs.anl.gov/mpi/mpich/.
[6] Top500 group (2003), Top 500 Supercomputer Sites, http://www.top500.org/list/2003/11/.
[7] Vandermissen, W. (2004), ::ClusterKnoppix - Main, http://bofh.be/clusterknoppix/.