
Running PBS on Mahar

About

Mahar has PBS installed in a shared mode. That is, PBS will start your jobs immediately, regardless of the load on the cluster. The only restriction is that PBS will run only as many jobs per user as there are active nodes. So, if 42 nodes are running, PBS will run up to 42 jobs per user.

The advantage of using PBS is that it allocates the least-loaded nodes first.

Quick tips

There are a few commands to use PBS easily. They are:
  • pbsrun: Use this command as a prefix to the command you want to run. E.g., go into the directory of the source code that you want to compile and, instead of typing make, type pbsrun make. PBS will then use the node with the least load to compile your source code.
  • qsub -I: This command will give you a shell on the node with the least load.
  • qstat -u <username>: will show your jobs and their status.
  • qdel <jobID>: will kill your job and delete it from the queue.
  • xpbs: An X Window tool to help you with your PBS jobs.
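As a small illustration of the pbsrun prefix, the sketch below wraps it in a helper so that the same command line also works on a machine where pbsrun is not installed. The function name run_on_cluster is hypothetical, and the sketch assumes only what is stated above: that pbsrun is used as a prefix to an ordinary command.

```shell
#!/bin/sh
# Hypothetical helper: run a command through pbsrun when it is available,
# otherwise run the command directly on the local machine.
run_on_cluster() {
    if command -v pbsrun >/dev/null 2>&1; then
        pbsrun "$@"
    else
        "$@"
    fi
}

# Example: report the name of the host that actually ran the command.
run_on_cluster uname -n
```

With this helper, run_on_cluster make behaves like pbsrun make on Mahar and like plain make elsewhere.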

One other tip that is not directly related to PBS but is worth mentioning: every node on the cluster has a very fast scratch disk at /scratch. Try to use this directory as much as possible. It is accessible only from the individual node and is not shared between nodes, but it is very fast. Please also clean up after yourself.
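A job script might use the scratch disk roughly as follows. This is only a sketch: the directory naming scheme, the file name result.txt, and the fallback to a temporary directory are assumptions made for illustration, and uname -a stands in for the real work.

```shell
#!/bin/sh
#PBS -S /bin/sh

# Assumption: /scratch is writable on the compute node. Fall back to a
# temporary directory so the sketch can also be tried on other machines.
SCRATCH_ROOT=/scratch
[ -w "$SCRATCH_ROOT" ] || SCRATCH_ROOT=${TMPDIR:-/tmp}

# PBS sets PBS_O_WORKDIR to the directory qsub was run from;
# outside PBS, use the current directory instead.
RESULT_DIR=${PBS_O_WORKDIR:-$PWD}

# Private working directory named after the user and the process ID.
WORK_DIR="$SCRATCH_ROOT/${USER:-user}.$$"
mkdir -p "$WORK_DIR" || exit 1
cd "$WORK_DIR" || exit 1

# Placeholder for the real work: write output to the fast local disk.
uname -a > result.txt

# Copy what you want to keep back to the shared directory ...
cp result.txt "$RESULT_DIR/result.txt"

# ... and clean up the scratch space afterwards.
cd "$RESULT_DIR" || exit 1
rm -rf "$WORK_DIR"
```

Because /scratch is local to each node, anything left there is not visible from the node you submitted from, so copy results back before removing the directory.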

Using PBS submission scripts

Here is a quick and easy script, called pbs.uname.sh, to get you started writing your own PBS submission scripts.

#!/bin/sh
#PBS -S /bin/sh

uname -a

To submit this script to PBS, run the command qsub pbs.uname.sh.

This gives you no advantage over using pbsrun, but it is the starting point for a larger script that can do much more than pbsrun, such as job dependencies and multi-CPU jobs (see the section on MPI below).
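As a rough sketch of a job dependency, the commands below write two small submission scripts and submit the second so that it starts only after the first has finished successfully. The file names, the job names, and the contents of the two stages are hypothetical. The sketch relies on standard PBS features (the -N and -j oe directives, the PBS_O_WORKDIR variable, and qsub -W depend=afterok:<jobID>) and has not been checked against Mahar's particular configuration.

```shell
#!/bin/sh
# Hypothetical first stage: records some output in the submission directory.
cat > pbs.stage1.sh <<'EOF'
#!/bin/sh
#PBS -S /bin/sh
#PBS -N stage1
#PBS -j oe

# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR" || exit 1
uname -a > stage1.out
EOF

# Hypothetical second stage: uses the file produced by the first stage.
cat > pbs.stage2.sh <<'EOF'
#!/bin/sh
#PBS -S /bin/sh
#PBS -N stage2
#PBS -j oe

cd "$PBS_O_WORKDIR" || exit 1
wc -c < stage1.out
EOF

# Submit only where PBS is installed. qsub prints the ID of the new job,
# which is then passed on as the dependency of the second job.
if command -v qsub >/dev/null 2>&1; then
    first=`qsub pbs.stage1.sh`
    qsub -W depend=afterok:"$first" pbs.stage2.sh
else
    echo "qsub not found; scripts written but not submitted"
fi
```

The -j oe directive merges standard error into the standard output file, so each stage leaves a single output file behind.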

The full manual for PBS can be found here

Using PBS with MPI

It is highly recommended to run MPI programs through PBS, as PBS will automatically allocate the nodes with the least load so that your jobs run quickly. PBS on Mahar is set up to start your jobs immediately and will queue them only if you have more jobs than there are Mahar nodes. If you have a number of experiments to run, PBS will queue the jobs for you, giving you a faster turnaround of results.

Step 1:

Email cme@csse.monash.edu.au and ask to be added to the "PBS" group.

Step 2:

Create a submission script like this, e.g. "pbs.mpi.sh":
#!/bin/sh
#PBS -S /bin/sh

# Start one MPI process for each line of the node file that PBS provides.
mpirun -machinefile "$PBS_NODEFILE" -np `wc -l < "$PBS_NODEFILE"` ./a.out
Replace a.out with the name of your MPI program.
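To see how the process count in the script above is derived, the following sketch builds a stand-in node file and counts its lines the same way. The file name nodefile.example and the host names are made up; inside a real job, PBS sets PBS_NODEFILE to a file listing the allocated nodes, one entry per line.

```shell
#!/bin/sh
# Stand-in for the node file PBS would provide (hypothetical host names).
NODEFILE=./nodefile.example
printf '%s\n' node01 node02 node03 > "$NODEFILE"

# Count the entries; tr removes the padding some versions of wc add.
NP=`wc -l < "$NODEFILE" | tr -d ' '`

# Show the command that the submission script would run.
echo "mpirun -machinefile $NODEFILE -np $NP ./a.out"
```

For the three-line file in this sketch, NP is 3, so mpirun would start three processes.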

Step 3:

Submit using this command:
qsub -l nodes=30 -W group_list=pbs@mahar pbs.mpi.sh
Replace the number of nodes with the number you require.

Step 4:

To see the status of the job, use:
qstat -n -u <username>
If things are going wrong, you can stop the job with this command:
qdel <jobID>
When the job is finished, standard error and standard output will appear in these files:
pbs.mpi.sh.e<jobID>
pbs.mpi.sh.o<jobID>
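The helper below sketches one way to locate and print these files once a job has finished. The function name pbs_output_files and the job ID 1234.mahar are hypothetical. It assumes the usual PBS default, in which qsub reports an ID of the form <number>.<server> and only the numeric part appears in the file names.

```shell
#!/bin/sh
# Print the default output file names for a script name and a job ID.
# Assumption: only the part of the ID before the first dot is used.
pbs_output_files() {
    script=$1
    seq=${2%%.*}
    echo "$script.o$seq"
    echo "$script.e$seq"
}

# Hypothetical job ID for illustration; use the one printed by qsub.
for f in `pbs_output_files pbs.mpi.sh 1234.mahar`; do
    if [ -f "$f" ]; then
        echo "== $f =="
        cat "$f"
    else
        echo "$f not written yet"
    fi
done
```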