Rackable Linux Cluster of Opterons (rcluster)
Running Jobs on the rcluster
- Using the Batch Queues
- Batch Queues on the rcluster
- LSF Usage Information
- Submitting a Batch Job to the Queue
- Checking the Status of Jobs
- Files Created at Job Start
- Canceling/Removing a Job
- Receiving an Email when Job Terminates
- Runchaining Jobs
- Running an Interactive Job
Using the Batch Queues
Jobs that run for more than ten (10) minutes must be submitted to the queues rather than run in the background or interactively on the login node rcluster.rcc.uga.edu. Background jobs and interactive commands, including cron jobs, at jobs, and processes started with & or nohup, as well as commands entered at the keyboard, will be terminated after 10 minutes of CPU time. Graphical front ends to programs, programming tools, etc. will not be terminated.
The queueing system being used on the rcluster is Platform LSF.
Batch Queues on the rcluster
The rcluster machines have 2 CPUs each. Some of them have "dual-core" (as opposed to "single-core") CPUs, which means they behave as though they had 4 CPUs rather than 2.
Queue names beginning with "s" followed by a number submit jobs specifically to single-core machines. Queue names beginning with "d" submit jobs specifically to dual-core machines. Queues whose names start with "r" don't care which type of machine they send jobs to. Queues defined specifically for the IOB begin with "iob" and queues defined specifically for the Statistics Department begin with "stat".
- Multi-thread jobs submitted to single-core machines (that is, queue names starting with "s" or "iob-s") can have two threads and those submitted to dual-core machines (that is, queue names starting with "d" or "stat-d" ) can have up to 4 threads.
- A job might have slightly different performance on single-core and on dual-core processors. Therefore, for better load balance, we recommend that parallel MPI jobs be sent to the queues that target specifically either single-core machines or dual-core machines, and not to the queues whose names start with "r".
The batch queue can be used for serial jobs (that is, jobs that require only one processor) and for parallel jobs. The form of a queue name indicates how many processors it is limited to and the run time limit. For example, the queue r4-24h has a limit of 4 processors and 24 hours of run time per processor. To submit a job to the resource, first determine your processor number and time requirements. This will determine which queue you need.
Here are more examples of queue names:
r1-24h | One processor, maximum run time of 24h, sends job to either single-core or dual-core machines
r1-96h | One processor, maximum run time of 96h, sends job to either single-core or dual-core machines
r1-10d | One processor, maximum run time of 10 days, sends job to either single-core or dual-core machines
r4-24h | Four processors, maximum run time of 24h per processor, sends job to either single-core or dual-core machines
s4-24h | Four processors, maximum run time of 24h per processor, sends job to single-core machines
d4-24h | Four processors, maximum run time of 24h per processor, sends job to dual-core machines
iob-s16-10d | Up to 16 processors, maximum run time of 10 days, sends job to single-core machines. For IOB's associate members only.
iob-s32-10d | Up to 32 processors, maximum run time of 10 days, sends job to single-core machines. For IOB's full members only.
stat-d16-10d | Up to 16 processors, maximum run time of 10 days, sends job to dual-core machines. For Statistics Dept. members only.
For a list of all valid queue names, please use the command queuenames from a rcluster shell prompt.
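The naming convention described above can be decoded mechanically. The following is a minimal bash sketch, assuming every queue name follows the pattern shown in the examples (an optional iob- or stat- prefix, a machine-type letter, a processor count, and a time limit in hours or days). The function name describe_queue is hypothetical and is not a command installed on the rcluster; the queuenames command remains the authoritative list.

```shell
#!/bin/bash
# Sketch: decode a queue name such as r4-24h or iob-s16-10d.
# describe_queue is a hypothetical helper, not an rcluster or LSF command.
describe_queue() {
    local name=$1
    local re='^((iob|stat)-)?([rsd])([0-9]+)-([0-9]+)([hd])$'
    if [[ ! $name =~ $re ]]; then
        echo "unrecognized queue name: $name" >&2
        return 1
    fi
    local group=${BASH_REMATCH[2]:-general}   # iob, stat, or general
    local nprocs=${BASH_REMATCH[4]}           # processor limit
    local limit=${BASH_REMATCH[5]}            # run time limit (number only)
    local machines unit
    case ${BASH_REMATCH[3]} in
        s) machines="single-core" ;;
        d) machines="dual-core" ;;
        r) machines="single-core or dual-core" ;;
    esac
    case ${BASH_REMATCH[6]} in
        h) unit="hours" ;;
        d) unit="days" ;;
    esac
    echo "group=$group procs=$nprocs limit=$limit $unit machines=$machines"
}

describe_queue r4-24h
describe_queue iob-s16-10d
```

A name that does not match the pattern makes the function return a non-zero status, which a wrapper script could use to refuse a submission before calling bsub.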
We recommend that users checkpoint their codes whenever possible to avoid losing valuable compute time if the system goes down before a job is completed. A long job that can be checkpointed can be run as a sequence of shorter jobs, which can be automatically submitted to the queue as described below in the Runchaining Jobs section. If you cannot fit your job within the established processor and runtime limits, please let us know.
LSF Usage Information
These are the common LSF commands:
bsub | Submit a job to the queue
bkill | Cancel a queued or running job
bhold | Place a queued job on hold
bjobs | Check the status of queued and running jobs
bqueues | List all valid queue names
The preferred way to submit a batch job to the queue is to use the bsub command to submit a job submission shell script. The syntax of the bsub command is:
bsub -n nprocs -q queuename -o stdout -e stderr ./shellscriptname
where
nprocs is the number of processors (not required for serial jobs)
queuename is the name of the batch queue
stdout is the name of the file where the standard output is stored
stderr is the name of the file where the standard error is stored
shellscriptname is the name of the job submission shell script file
Examples:
1. To submit a serial job with script sub.sh to the r1-24h batch queue and have the standard output and error go to test.jobid.out and test.jobid.err, respectively, use
bsub -q r1-24h -o test.%J.out -e test.%J.err ./sub.sh
2. To submit a 4-processor parallel job with script subp.sh to the r4-24h batch queue and have the standard output and error go to test.jobid.out and test.jobid.err, respectively, use
bsub -n 4 -q r4-24h -o test.%J.out -e test.%J.err ./subp.sh
IMPORTANT NOTES:
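1. If you do not specify the standard output and error files (that is, if you omit -o test.%J.out -e test.%J.err in the submission command), the standard output and error of the job will be sent to you by email.
2. The job submission script must be executable. To make sub.sh executable, use
chmod u+x sub.sh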
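Because bsub is given the script as ./sub.sh, the script needs execute permission, which is what the chmod u+x command above provides. The following is a minimal sketch of a check that could be run before submitting. The helper name prepare_script is hypothetical, the stand-in script is created in a scratch directory only for demonstration, and the bsub command is printed rather than executed.

```shell
#!/bin/bash
# Sketch: make sure a job script is executable before handing it to bsub.
# prepare_script is a hypothetical helper, not part of LSF.
prepare_script() {
    local script=$1
    if [ ! -f "$script" ]; then
        echo "no such script: $script" >&2
        return 1
    fi
    # The script is run as ./script, so its owner needs execute permission
    [ -x "$script" ] || chmod u+x "$script"
}

# Demonstration in a scratch directory, with a trivial stand-in script
demo_dir=$(mktemp -d)
printf '#!/bin/bash\necho hello\n' > "$demo_dir/sub.sh"
if prepare_script "$demo_dir/sub.sh"; then
    # Printed rather than executed, so the sketch also runs off the cluster
    echo "bsub -q r1-24h -o test.%J.out -e test.%J.err ./sub.sh"
fi
```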
Examples of job submission shell scripts (sub.sh):
In the examples below, the executable is called myprog and reads its input parameters from standard input. The input parameters are in a file called myin, and the output data will be stored in a file called myout. The working_directory is the path to your working directory (for example, /home/labname/username/subdir or /scratch/username/subdir).
To run a serial job:
#!/bin/csh
cd working_directory
time ./myprog < myin > myout
To run a parallel MPI job using 4 processors (csh shell):
#!/bin/csh
cd working_directory
echo $LSB_HOSTS
cat /dev/null > mlist.$$
foreach variable ($LSB_HOSTS)
echo $variable >> mlist.$$
end
mpirun -np 4 -machinefile mlist.$$ ./myprog < myin > myout
rm -f mlist.$$
To run a parallel MPI job using 4 processors (bash shell):
#!/bin/bash
cd working_directory
echo $LSB_HOSTS
cat /dev/null > mlist.$$
for variable in $LSB_HOSTS; do
echo $variable >> mlist.$$
done
mpirun -np 4 -machinefile mlist.$$ ./myprog < myin > myout
rm -f mlist.$$
To run a parallel OpenMP job using 2 threads:
#!/bin/csh
cd working_directory
setenv OMP_NUM_THREADS 2
./myprog < myin > myout
NOTE: Do NOT put the job into the background with a '&' in the shell script. This will confuse the queueing system.
The file myin in the examples above is only necessary if your program requires standard input data, and the file myout is only necessary if you want the standard output data (if any) to be stored in a separate file instead of the standard output file of the batch job (test.jobid.out in the example above). If your program does not need one or both of these files, remove the corresponding redirection symbols ( < and/or > ) in the last line of the scripts above.
MORE IMPORTANT NOTES:
1. MPI jobs executed with mpirun must use the -machinefile option as shown in the examples above; otherwise, your MPI job will not use the processors assigned to it by the queueing system. With the scripts above, a file called mlist.xxxxx containing the list of processors assigned to your job is generated when your job starts running and is deleted when your job is done. The processors used for your job are also listed in the stdout.
2. When running threaded applications, please add the bsub option -R "span[hosts=1]" to ensure that all processors assigned to your job (up to 4 when running on dual-core machines and up to 2 when running on single-core ones) are on the same machine. Without this bsub option, LSF might assign processors on different machines to your job.
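The machinefile step in the MPI scripts above hard-codes -np 4, which has to match the -n value given to bsub. As a possible variation, the sketch below counts the entries in LSB_HOSTS while writing the machinefile and uses that count for -np. The host names nodeA and nodeB are placeholders used only when LSB_HOSTS is not set (that is, outside of an LSF job), and the mpirun command is printed rather than executed.

```shell
#!/bin/bash
# Sketch: build the machinefile from LSB_HOSTS and derive the process count.
# The host names below are placeholders, used only when LSF has not set LSB_HOSTS.
LSB_HOSTS=${LSB_HOSTS:-"nodeA nodeA nodeB nodeB"}

mlist=mlist.$$
: > "$mlist"                      # start with an empty file
nprocs=0
for host in $LSB_HOSTS; do        # one line per assigned processor
    echo "$host" >> "$mlist"
    nprocs=$((nprocs + 1))
done

# Printed rather than executed here; in a job script, drop the echo
echo mpirun -np "$nprocs" -machinefile "$mlist" ./myprog
rm -f "$mlist"
```

With this approach, changing the processor count only requires changing the -n option on the bsub command line.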
Checking the Status of Jobs
Use the bjobs command to check the status of jobs:
bjobs [-u username] [-l] [jobid]
where username is the user whose jobs you want to check and jobid is the JOBID of a specific job. The -l option gives long output, with detailed information about the job(s).
For example:
bjobs -u all | shows all the jobs in the pool
bjobs -u johndoe | shows all jobs for user johndoe
bjobs -l 10407 | gives detailed information about the job with JOBID 10407
Files Created at Job Start
If you submit your job with the -o mystdout -e mystderr options,
then the files mystdout and mystderr will be
created when your job starts running, unless they already exist.
In the latter case, the stdout and stderr of the job will be
appended to the corresponding files. If you would like to have the jobid number
incorporated into the stdout and stderr file names, use the special character %J
in these file names.
If the -o and -e options are not specified at job submission, the stdout and stderr of the job will be emailed to your rcluster account, and rcluster will automatically forward the message to the email address that you listed when you requested your rcluster account (for example, your ugamail or departmental account). The sender of the email is LSF. Check whether your email server flags such messages as spam and filters them out; to prevent this, you might want to whitelist messages sent by LSF.
Canceling/Removing a Job
Use the bkill command to cancel/remove a job from the job pool:
bkill [-u username] jobid [jobid]
For example:
bkill 10408 | cancels your job with JOBID 10408
bkill 10408 10409 | cancels your jobs with JOBIDs 10408 and 10409
bkill -u your_user_id 0 | cancels all jobs you have in the queue (the job ID 0 selects all of the user's jobs)
Receiving an Email when Job Terminates
When you submit a batch job with bsub without the -o and -e options, you will receive the standard output and standard error of the job by email when the job terminates (whether it completes successfully or not). You can add the bsub option -N to have the standard output of the LSF job (not of the application) sent to you when the job terminates. The standard output of the application and the standard error of the job can still be written to files specified by the -o and -e options, respectively. For example:
bsub -n 4 -q r4-24h -o out.%J -e err.%J -N ./sub.sh
The 4 processor job running on the r4-24h queue will write the standard output of the application in the file out.jobid, write the standard error of the job in the file err.jobid, and it will send the standard output of the batch job (exit code, CPU time used, node used, etc) to the user's preferred email address.
Runchaining Jobs
A common need is to run the same job repeatedly. For instance, when a computation requires a large number of iterations, each run performs a fixed number of them and writes to a data set the information needed to restart the job where it left off. When the job is restarted, it reads the restart information and continues from where the previous execution stopped.
To have one job automatically submit the next one once it finishes,
you can add the following lines at the end of your job submission
script:
bsub -n nprocs -q queuename -o stdout -e stderr ./next_script_name
exit
Example: sub1.sh
In the examples below we assume that the executable myprog does not require any standard input. The working directory is assumed to be /home/labname/username/subdirectory.
Serial job using csh (tcsh):
#!/bin/csh
cd /home/labname/username/subdirectory
time ./myprog
bsub -q r1-24h -o sub.%J.out -e sub.%J.err ./sub2.sh
exit
Parallel job using csh (tcsh):
#!/bin/csh
cd /home/labname/username/subdirectory
echo $LSB_HOSTS
cat /dev/null > mlist.$$
foreach variable ($LSB_HOSTS)
echo $variable >> mlist.$$
end
mpirun -np 4 -machinefile mlist.$$ ./myprog
rm -f mlist.$$
bsub -n 4 -q r4-24h -o sub.%J.out -e sub.%J.err ./sub2.sh
exit
First the script sub1.sh is submitted to the queue. Once it finishes running, it automatically submits script sub2.sh to the queue. This script can in turn submit sub3.sh to the queue when it completes, and so on. For this procedure, the user can prepare a sequence of scripts, which will then be submitted one at a time to the queue and run in sequence. Alternatively, the script sub1.sh can resubmit itself back to the queue once it finishes running. This would create an "infinite loop", a situation that is not recommended. To break the infinite loop, the user can set some termination rules for the job resubmission process.
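For the sequence sub1.sh, sub2.sh, sub3.sh, and so on, the name of the next script can be derived from the name of the current one, so that every script in the chain can end with the same lines. The following is a minimal bash sketch of that idea. The function name next_script is hypothetical, the chain is assumed to end when no executable next script exists, and the bsub command is printed rather than executed.

```shell
#!/bin/bash
# Sketch: derive the name of the next script in a chain (sub1.sh -> sub2.sh).
# next_script is a hypothetical helper introduced for illustration.
next_script() {
    local current=${1##*/}             # strip any leading directory
    local re='^(.*[^0-9])([0-9]+)\.sh$'
    [[ $current =~ $re ]] || return 1  # name does not end in <number>.sh
    # 10# forces base 10 so that sub08.sh is not read as octal;
    # zero padding is not preserved (sub08.sh -> sub9.sh)
    echo "${BASH_REMATCH[1]}$((10#${BASH_REMATCH[2]} + 1)).sh"
}

this=sub1.sh                           # in a real job script this would be $0
next=$(next_script "$this")
if [ -x "./$next" ]; then
    # Printed rather than executed, so the sketch also runs off the cluster
    echo "bsub -q r1-24h -o sub.%J.out -e sub.%J.err ./$next"
else
    echo "no executable $next found; the chain ends here"
fi
```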
Example of a termination rule:
One way to break out of an infinite job resubmission loop is to have the code generate a file when the program finally "converges" (or when it completes a predetermined number of steps, for example). Let us call this file finalresults.txt. The job submission script sub.sh checks whether the file finalresults.txt exists. If it does not, then the script sub.sh is submitted to the queue again, otherwise the script simply exits and the resubmission chain is terminated. A simple script sub.sh that accomplishes this is the following:
Serial job using csh (tcsh):
#!/bin/csh
cd /home/labname/username/subdirectory
time ./myprogram
if ( ! -e finalresults.txt ) then
bsub -q r1-24h -o mystdout -e mystderr ./sub.sh
endif
exit
Serial job using ksh (bash):
#!/bin/ksh
cd /home/labname/username/subdirectory
time ./myprogram
if [ ! -f finalresults.txt ]
then
bsub -q r1-24h -o mystdout -e mystderr ./sub.sh
fi
exit
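The termination rule above relies on the program eventually writing finalresults.txt. As an additional safeguard, the resubmission can also be capped at a fixed number of runs, so that the chain stops even if the program never converges. The following is a minimal bash sketch under that assumption. The counter file name runcount.txt, the limit max_runs, and the function name should_resubmit are hypothetical, and the bsub command is printed rather than executed.

```shell
#!/bin/bash
# Sketch: resubmit until finalresults.txt appears, but never run more than max_runs times.
# runcount.txt, max_runs and should_resubmit are assumptions made for illustration.
max_runs=20

should_resubmit() {
    local dir=$1 count=0
    [ -f "$dir/finalresults.txt" ] && return 1      # converged: stop the chain
    [ -f "$dir/runcount.txt" ] && count=$(cat "$dir/runcount.txt")
    count=$((count + 1))
    echo "$count" > "$dir/runcount.txt"             # record how many runs have completed
    [ "$count" -lt "$max_runs" ]                    # stop once the cap is reached
}

workdir=$(mktemp -d)      # stands in for the job's working directory
if should_resubmit "$workdir"; then
    # Printed rather than executed, so the sketch also runs off the cluster
    echo "bsub -q r1-24h -o mystdout -e mystderr ./sub.sh"
fi
```

In a real job script, the call to should_resubmit would come after the program has run, and workdir would be the directory the script changed into at the start.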
Running an Interactive Job
We have set aside one dual-core, dual-processor node (4 CPUs), called inter1, for interactive jobs. This node is not part of the queueing system. To access this node, first log in to rcluster.rcc.uga.edu and from there use ssh to connect to inter1:
rcluster> ssh inter1
Your prompt on inter1 will not read inter1; it will read, for example, compute-2-12 or a similar name.
A single processor executable (a.out) can be run on inter1 as follows:
compute-2-12> ./a.out
Or run the code using 'nohup' in order to be able to log out without interrupting the running job:
compute-2-12> nohup ./a.out &
To run a parallel MPI job interactively, first you need to create a file (for example, call it host.list) with the word 'inter1' (without the quotes) in it, repeated 4 times in a column. That is, the contents of host.list will be
inter1
inter1
inter1
inter1
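Instead of typing the file by hand, it can be generated from the shell. A minimal bash sketch, in which the variable name nprocs is chosen for illustration:

```shell
#!/bin/bash
# Sketch: write host.list with one "inter1" line per processor to be used.
nprocs=4                     # inter1 has 4 CPUs, so do not go above 4
: > host.list                # create the file, or empty it if it already exists
for ((i = 0; i < nprocs; i++)); do
    echo inter1 >> host.list
done
cat host.list
```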
Put this file (host.list) in your working directory and then run the MPI program a.out as follows (e.g. using 4 processors):
compute-2-12> mpirun -np 4 -machinefile host.list ./a.out
compute-2-12> nohup mpirun -np 4 -machinefile host.list ./a.out &
Because this node has a total of 4 CPUs, users should not run parallel jobs that use more than 4 processors or threads.
This node should only be used for short jobs (for example, for debugging purposes) and for those that cannot be run on the batch queueing system (for example, if the job requires an X windows front-end). The load of the node can be monitored using top or w.

