
User Guide for the High Performance Computing Cluster

in the School of Physics

Prepared by Sue Yang (xue.yang@sydney.edu.au)

This document aims to help users quickly log in to the cluster, set up their software
environment and get their jobs running. It does not cover the cluster hardware or operating system
(e.g. the Linux shell, file systems, etc.), nor does it cover the usage of software development tools and
parallel programming. If required, those topics may be included in the future.

Connecting to the cluster


The full host name of the cluster is

headnode.physics.usyd.edu.au

You can use secure shell (ssh) to connect to the cluster (using either the short name or the full name):

~ > ssh headnode

Software environments
You often need to set up the software environment for a program you wish to use on a computer
system, for example by adding PBS to your search PATH. This can be specified in your login shell
script file, .cshrc or .bashrc; if you have already done this, you can keep using it (a sketch of what
such a line might look like is shown below). Otherwise, you are encouraged to use the Environment
Modules package for this purpose.
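
For those who prefer the manual approach, a line such as the following in .cshrc would put the PBS
client commands (which, as noted later in this guide, live in /usr/physics/torque/bin) on your search
PATH. This is only an illustrative sketch, not a recommended setup:

setenv PATH /usr/physics/torque/bin:${PATH}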

The Environment Modules package provides a convenient way to easily customize your shell
environment, especially on the fly. To see the list of software for which environment modules are
available, enter module avail:

headnode: ~ > module avail

----------------------------- /usr/physics/Modules/3.2.8/modulefiles ------------------------------

IntelCompilerSuite PBS ROOT-v5.28 openMPI-1.6.5-gnu openMPI-1.6.5-intel

headnode: ~ > module whatis PBS

PBS : Sets up torque 2.4.7 and maui 3.3.1 in your environment

You can set up an environment on the fly, e.g. for PBS:

headnode: ~ > module load PBS

or add this line to your .cshrc to set up PBS permanently:

module load PBS

That way, each time you log in, the configuration for PBS will be done for you. You can then run
PBS commands such as qsub and qstat, or use man qsub to get information about qsub.

You may need to unload a package before loading another one to avoid conflicts. For example,
if you have already set up openMPI with the Intel compilers and now want to use it with the GNU
compilers, do this:

headnode: ~ > module unload openMPI-1.6.5-intel

headnode: ~ > module load openMPI-1.6.5-gnu

The command module help or man module will give you more information about how to use the
Environment Modules package.
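
Putting this together, a .cshrc that sets up both PBS and the GNU build of openMPI on every login
might contain the two module load lines below; module list then shows what is currently loaded.
This is only a sketch using the module names listed by module avail above:

module load PBS
module load openMPI-1.6.5-gnu

headnode: ~ > module list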

Workload Management System (PBS)


This section covers the topics:

What is PBS; Basic PBS user commands; Available queues on the cluster; Job submission and
Job Script Template; Tips for specifying resources; Additional job script templates; Monitoring
jobs; Interactive jobs; Array jobs

What is PBS

PBS is a distributed workload management system. As such, PBS handles the management of
computational workload on a set of compute nodes. PBS plays three primary roles: queuing,
scheduling and monitoring jobs. From the user's perspective, PBS allows you to make more
efficient use of your time. You specify the tasks you need executed. The system takes care of
running these tasks and returning the results to you. If the available compute nodes are full, then
PBS holds your work and runs it when the resources are available.

You create a batch job, which you then submit to PBS. A batch job is a file (a shell script)
containing a set of commands you want to run on a set of execution machines. It also contains
directives which specify the characteristics (attributes) of the job and the resource requirements (e.g.
number of processors, amount of memory and length of time) that your job needs. Once you have
created your PBS job, you can reuse it or modify it for subsequent runs.

Basic PBS user commands

headnode: ~ > qsub run-job.csh

submit a job with the job script file run-job.csh

headnode: ~ > qstat -u uid

display job status for user uid only

headnode: ~ > qstat -n

display the status of all jobs

headnode: ~ > qstat -Q

show the status of the available queues

headnode: ~ > qstat -f job-id

display status details for the specified running job

headnode: ~ > qdel job-id

delete job: job-id

All PBS client commands are in /usr/physics/torque/bin on headnode. Use the man pages for detailed
usage of each command.
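
A typical session strings these commands together: submit the job, check its status, and delete it if
necessary. The job ID below is purely illustrative (qsub prints the ID of the newly created job when
it accepts the submission):

headnode: ~ > qsub run-job.csh
2946.headnode.physics.usyd.edu.au
headnode: ~ > qstat -u uid
headnode: ~ > qdel 2946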

Available queues on the cluster

Queue name for all physics users (jobs will run on nodes 01-16 and 31-35):

physics

Queue name for Complex Systems users (jobs will run on nodes 21-23):

yossarian

Queue name for Medical Physics users (jobs will run on nodes 01-16 and 31-35):

hippocrates

Queue name for Condensed Matter Theory users (jobs will run on nodes 41-45):

cmt

Job submission and Job Script Template

Job submission is done by running the PBS command qsub:

headnode: ~ > qsub run-job.csh

where run-job.csh is a batch job script which contains qsub options and the commands/programs that
you want to run. Here is an example run-job.csh:

#!/bin/csh
#PBS -N MyJobName
#PBS -o demo.txt
#PBS -j oe
#PBS -q yossarian
#PBS -l nodes=1:ppn=4
#PBS -l walltime=00:01:00
#PBS -m ea
#PBS -M username@physics.usyd.edu.au
#PBS -V
cd "$PBS_O_WORKDIR"
#your commands/programs start here, for example:
hostname
exit

If you submit this job, it will generate a file demo.txt containing the hostname of the node it ran on.
The output may also contain harmless TTY warnings related to using tcsh rather than bash.
Notes on the example run-job.csh above are as follows:

#!/bin/csh
This indicates the script will run under the C shell.

Lines starting with #PBS are options of the PBS command qsub.

-N MyJobName
The name for your job
-o demo.txt
The filename to write standard output from your job
-j oe <optional>
Merges stdout and stderr into the output file. Otherwise, PBS will automatically create a
separate error log
-q yossarian
Select which PBS queue to use. Use the queue corresponding to your group
-l nodes=1:ppn=4
Specify the CPU resources required, 4 processors on 1 node specified here.
-l walltime=00:01:00
maximum wall time requested to run the job, 1 minute specified here. Warning: if the job
hasn't finished when it reaches this walltime, your job will be killed.
-m ea
Sends a notification email when job ends/aborts.
-M username@physics.usyd.edu.au
your email address, specified here.
-V
Declares that all environment variables in the qsub command's environment are to be
exported to the batch job. If this directive is omitted, your job may be terminated because,
e.g., $TERM is not set.
cd "$PBS_O_WORKDIR"
change to the directory from which you submitted the job (the variable $PBS_O_WORKDIR
contains that path).

Tips for specifying resources

The main cluster resources are compute nodes, processors, memory and execution time.
Multiple users share the resources on the cluster. The general advice is to request resources as
accurately as your job needs. As seen above, you specify the number of nodes and the number of
processors with the option -l nodes=*:ppn=*. nodes designates how many nodes your job should
be executed on, and ppn specifies the number of processors that will be allocated on each node.
For example,

-l nodes=1:ppn=1 1 processor on 1 node. This is what you should use for a non-parallel program
-l nodes=2:ppn=1 1 processor per node, for a total of 2 processors
-l nodes=1:ppn=14 14 processors on 1 node. This option will cause the queue to reject your job
because no nodes have enough processors

PBS will reserve the number of nodes and processors you have specified for your job no matter
how many processors your job actually runs on. These nodes and processors will not be given
any new tasks while your job is running. On the other hand, if you request -l nodes=1:ppn=1 for
a Matlab job which uses a matlabpool of size 8 (it will run on 8 processors), PBS won't know
your Matlab program uses 8 processors and may assign some processors on the node to other
jobs. Your job and the other jobs will share 7 processors and this will cause all of the jobs to slow
down. Therefore, it is important that you request the correct number of nodes and processors for
your job.
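
For instance, if your Matlab script opens a matlabpool of size 8, the resource request should match it.
A minimal sketch (the exact matlabpool call depends on your Matlab version, and yourMatlabScript
is only a placeholder name):

#PBS -l nodes=1:ppn=8
matlab -nodisplay -r "matlabpool(8); yourMatlabScript; matlabpool close; exit"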

Each node has about 32GB of swap space, which means that when jobs use up all the physical
memory, memory swapping will occur to keep jobs running. Memory swapping will slow
down all jobs running on the node, too. You can reserve a certain amount of physical memory by
specifying -l mem=??MB or -l mem=??GB (the maximum amount of physical memory used by the
job) to avoid using swap space. For example,

-l mem=3GB to reserve 3GB of physical memory for your job

A little trial and error may be required to find how much memory your job is using. Your job
will only run if there is sufficient free memory (more than 3GB in the above example), so
making a sensible memory request will allow your job to run sooner. If your job
needs more memory than you have specified, it will be terminated when it reaches mem.
Users may reserve more memory for their jobs by simply requesting all (or more) processors on a
node instead of specifying the required memory size. This works, but it actually blocks other users'
jobs with smaller memory usage from running.

It is recommended that you use -l walltime=* instead of -l cput=* to specify how much time your
program is allowed to run for. walltime literally refers to wall time, the amount of time that a
clock on the wall shows (as opposed to CPU time, the time all processors actually spend on a
task). Once it reaches walltime, your job will be terminated by PBS. It is always best to make
this request as accurate as possible.
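
Putting these together, a job that needs, say, 4 processors on one node, 3GB of physical memory and
at most 2 hours of wall time (the values here are only an example) would include the following
directives in its script:

#PBS -l nodes=1:ppn=4
#PBS -l mem=3GB
#PBS -l walltime=02:00:00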

Additional Job Script Templates

Job script for running Matlab program:

#!/bin/csh
#PBS -q physics
#PBS -l walltime=1:00:00
#PBS -l nodes=1:ppn=4
#PBS -V
#PBS -N test-matlab
#PBS -m ea
#PBS -M firstname.lastname@sydney.edu.au
#PBS -j oe
#PBS -o output.txt
cd ${PBS_O_WORKDIR}
# run matlab file yourMatlabScript.m:
matlab -nodisplay -r "yourMatlabScript, exit"

Job script for running MPI program:

#!/bin/csh
#PBS -q physics
#PBS -l walltime=10:00:00
#PBS -l nodes=4:ppn=2
#PBS -V
#PBS -N test-mpi
#PBS -m ea
#PBS -M firstname.lastname@sydney.edu.au
#PBS -j oe
#PBS -o output.txt
cd "$PBS_O_WORKDIR"
mpirun -n 8 yourMPIcode # n = nodes x ppn (see resource request)
exit
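
Before submitting, the MPI program must be built against the same MPI installation the job will use.
A minimal sketch, assuming a C source file yourMPIcode.c (a hypothetical name) and the GNU build
of openMPI from the Modules section:

headnode: ~ > module load openMPI-1.6.5-gnu
headnode: ~ > mpicc -o yourMPIcode yourMPIcode.c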

Monitoring jobs

Use the command qstat -n to view the status of all submitted jobs. Alternatively, you can monitor the
execution of your job by using qload or qtop.

By default, qload shows you a list of all jobs currently in the queue, a summary of which users
are using the system and information on workload over the cluster. For example,

headnode:~ > qload


Job ID Job name Owner Queue N/CPU Time remaining Status
235385 Sensor sxy cmt 5/25 1h 00m 00s Running

USER LOAD
1- SXY/XUE YANG (25 CORES) 1h 00m 00s remaining

AVAILABILITY: Medical 0/0, Complex 0/0, CMT 0/0


node01 |------------| (12) 0.0 30.5GB node14 |------------| (12) 0.0 30.6GB
node02 |------------| (12) 0.1 30.6GB node15 |------------| (12) 0.0 30.5GB
node03 |------------| (12) 0.1 30.5GB node16 |------------| (12) 0.0 30.5GB
node04 |------------| (12) 0.0 30.6GB node31 |------------| (12) 0.0 30.5GB
node05 |------------| (12) 0.0 30.6GB node32 |------------| (12) 0.0 30.6GB
node06 |------------| (12) 0.0 30.5GB node33 |------------| (12) 0.0 30.5GB
node07 |------------| (12) 0.0 30.5GB node34 |------------| (12) 0.0 30.5GB
node08 |------------| (12) 0.0 30.5GB node35 |------------| (12) 0.0 30.6GB
node09 |------------| (12) 0.0 30.6GB node41 |xxxxx-----------| (11) 4.9 123.9GB
node10 |------------| (12) 0.0 30.6GB node42 |xxxxx-----------| (11) 5.0 123.8GB
node11 |------------| (12) 0.0 30.5GB node43 |xxxxx-----------| (11) 5.0 124.1GB

node12 |------------| (12) 0.0 30.5GB node44 |xxxxx-----------| (11) 5.0 124.2GB
node13 |------------| (12) 0.0 30.5GB node45 |xxxxx-----------| (11) 4.9 124.1GB
node21 |-------------------------------| (31) 0.0 186.8GB
node22 |-------------------------------| (31) 0.0 186.8GB
node23 |-------------------------------| (31) 0.0 186.8GB

Your jobs are colored red in the node availability report, so you can see which nodes your job is
running on.

Several switches are available for qload:

-a view jobs from all users, not just yourself


-u USER view jobs from a different user, and highlight their jobs instead of yours. If you
combine -u and -a it will show jobs from all users, with highlights from the user
specified with -u (as shown in the example after this list).
-s only show a summary of node availability (to quickly check available resources)
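
For example, to see every job in the queue while highlighting those belonging to user sxy (the user
from the example above), you could combine the switches:

headnode: ~ > qload -a -u sxy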

If you want to delete your job before it finishes, use the qdel command and provide your Job ID
from qload. To remove the job Sensor owned by sxy as shown above, user sxy would run,

sxy@headnode: ~ > qdel 235385

Interactive jobs

You can start an interactive session via PBS by using qsub -I. This will create an interactive
job, and you will be given a shell on a compute node as though you had used ssh. For example:

headnode: ~ > qsub -I -q physics

qsub: waiting for job 2945.headnode.physics.usyd.edu.au to start

qsub: job 2945.headnode.physics.usyd.edu.au ready

node02: ~ >

This is ideal for compiling code and testing.

When using an interactive job, you can specify the number of nodes and CPUs to lock out
(although requesting more than one node for an interactive job is only useful if you are going to be
using mpirun). For example,

user@headnode: ~ > qsub -I -l nodes=1:ppn=8

would start an interactive job that locks out an entire node. Interactive jobs will also appear in
qload. Please do not use interactive jobs to perform unattended runs (e.g. with batch or screen);
interactive jobs are ONLY for attended interactive use. By default, interactive jobs will terminate
after 1 hour. You can increase this by setting walltime with the -l flag, just as in a
PBS script file. Please do not start interactive jobs with excessive walltime requests.
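
For example, a two-processor interactive session in the physics queue with a 4-hour limit (the values
are illustrative only) could be started with:

headnode: ~ > qsub -I -q physics -l nodes=1:ppn=2 -l walltime=04:00:00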

Array jobs

Array jobs are one of the most powerful features of PBS for single-CPU jobs, and are a very
compelling reason for many users to learn and switch to the PBS system. They are useful when
you want to run the same program many times, operating on different input files or with different
input arguments. Array jobs allow you to quickly submit all of the jobs at once, and will run
several instances of your job at the same time. For example, suppose I had a directory with files
data1.csv, data2.csv and data3.csv, and I wanted to run my program myprog FILE on each of
them. I can do this very easily using the -t option:
#!/bin/csh
#PBS -N MyJobName
#PBS -o demo.txt
#PBS -q yossarian
#PBS -l nodes=1:ppn=4
#PBS -l walltime=00:01:00
#PBS -m ea
#PBS -M username@physics.usyd.edu.au
#PBS -V
#PBS -t 1-3
cd "$PBS_O_WORKDIR"
myprog data${PBS_ARRAYID}.csv

The -t switch instructs PBS to submit this as an array job. You can specify a range of indices (1-3)
or individual indices (1,3,5). For each index, PBS creates a separate job. Submitting this script
will cause 3 jobs to be created, each of them requesting 4 CPUs on 1 node. The variable
$PBS_ARRAYID stores the value of the array index in each submitted job, so each of the 3 jobs
will run with a different value of $PBS_ARRAYID. In this way, myprog will run on each of the
3 data files, even though only one script was submitted to PBS. You can of course do fancier
things with the index, like using more sophisticated scripting to operate on the array ID before
calling your program, as sketched below.
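
As a sketch of such scripting (the list file names.txt and its contents are hypothetical), the body of the
job script could use the array ID to pick one line from a list of input files before calling the program:

cd "$PBS_O_WORKDIR"
# names.txt holds one input file name per line; pick the line matching this job's array index
set infile = `sed -n "${PBS_ARRAYID}p" names.txt`
myprog $infile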

Another useful way to use the array ID is as an argument to a Matlab function. For example, if
the command in the PBS script was

matlab -nodisplay -r "mymatlabscript(${PBS_ARRAYID});exit"

then mymatlabscript.m would be run for each of the different array ID values. You can then
write code in Matlab to decide what each of the array ID values will do.
