How do I run jobs on SCIAMA with SLURM?

Running jobs in an interactive shell


Sometimes users may want to run a program on SCIAMA and interact with it directly or via its graphical user interface (e.g. Mathematica). For that you can create an interactive shell on a compute node:

sinteractive [<options>]

e.g.

sinteractive -n2 -c4 -t 1:00:00

which allocates 2 tasks with 4 cores each for the interactive job (sinteractive takes the same resource-allocation arguments as e.g. salloc and srun).
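
A typical interactive session might then look like the following sketch; the resource request is only an example, and the modules loaded are taken from the jobscript examples further down and are purely illustrative:

# request an interactive shell on a compute node: 1 task, 4 cores, 1 hour
sinteractive -n1 -c4 -t 1:00:00

# once the shell on the compute node opens, set up the environment as usual
module purge
module load system/intel64
module load python/3.7.1

# run the program interactively, e.g. start an interactive Python session
python

# when finished, leave the shell to release the allocated resources
exit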

Submit/run a batch job to/on SCIAMA


In most cases, computation-heavy codes do not require any interaction with the user. They simply run on their own until the results are calculated and written back to the disk.

To submit such a batch job use the command sbatch followed by the path to your jobscript:

sbatch <path/to/jobscript>

To check on the status of a job use the command squeue. Using the -u option allows you to see only the jobs you have submitted:

squeue -u <username>

To cancel a job use scancel and the job number (listed in the output of squeue):

scancel <jobnumber>
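
Putting these commands together, a typical batch workflow might look like the following sketch (the script name is a placeholder):

# submit the jobscript; SLURM prints the assigned job number on submission
sbatch ~/my_jobscript.sh

# list only your own jobs and their status (R = running, PD = pending)
squeue -u $USER

# cancel a job that is no longer needed, using the job number shown by squeue
scancel <jobnumber>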

Example jobscripts


This is an example jobscript for a simple serial job that could be submitted on SCIAMA using the sbatch command:

#!/bin/bash
# This is an example job script for running a serial program
# these lines are comments
# SLURM directives are shown below

# Configure the resources needed to run my job, e.g.

# job name (default: name of script file)
#SBATCH --job-name=my_job
# resource limits: number of cores, maximum wall clock time the job may run,
# and maximum memory (RAM) the job needs during run time:
#SBATCH --ntasks=1
#SBATCH --time=1:30:00
#SBATCH --mem=8G
# define log files for output on stdout and stderr
#SBATCH --output=some_output_logfile
#SBATCH --error=some_error_logfile
# choose system/queue for job submission (default: sciama2.q)
# for more information on queues, see related articles
#SBATCH --partition=training.q

# set up the software environment to run your job in
# first remove all pre-loaded modules to avoid any conflicts
module purge
# load your system module e.g.
module load system/intel64
# now load all modules (e.g. libraries or applications) that are needed
# to run the job, e.g.
module load intel_comp/2019.2
module load python/3.7.1

# now execute all commands/programs that you want to run sequentially, e.g.
cd ~/some_dir
srun python do_something.py
srun ./myapplication -i ~/data/some_inputfile

Don't confuse a comment "#" with a queuing system directive "#SBATCH".

And here is an example job script for running a parallel code (MPI and/or multithreaded) on more than one core on SCIAMA; a sketch for a purely multithreaded job follows after this example:


#!/bin/bash
# This is an example job script for running a parallel program

#SBATCH --job-name=my_parallel_job
# define number of nodes/cores needed to run the job
# here: 65 nodes with 12 processes per node (= 780 MPI processes in total)
#SBATCH --nodes=65
#SBATCH --ntasks-per-node=12
#SBATCH --time=1:30:00
#SBATCH --output=some_output_logfile
#SBATCH --error=some_error_logfile
#SBATCH --partition=training.q

module purge
# don’t forget to load the system module & compiler/MPI library you compiled your code with, e.g. OpenMPI
module load system/intel64
module load intel_comp/2016.2
module load openmpi/4.0.1

# now execute all commands/programs that you want to run sequentially
# on each allocated processor; for MPI applications add '--mpi=pmi2' (for OpenMPI)
cd ~/some_dir
srun --mpi=pmi2 ./my_parallel_application -i ~/data/some_inputfile
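
The example above requests cores for MPI processes. For a purely multithreaded (e.g. OpenMP) code running on a single node, the request uses --cpus-per-task instead. The sketch below is only illustrative: it reuses the partition and module names from the examples above, and the application name is a placeholder.

#!/bin/bash
# Illustrative sketch: a multithreaded (OpenMP) job using 12 cores on one node

#SBATCH --job-name=my_openmp_job
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --time=1:30:00
#SBATCH --output=some_output_logfile
#SBATCH --error=some_error_logfile
#SBATCH --partition=training.q

module purge
module load system/intel64
module load intel_comp/2019.2

# tell the OpenMP runtime to use all cores allocated to the task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

cd ~/some_dir
./my_threaded_application -i ~/data/some_inputfile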

More information about the #SBATCH directives that can be used in jobscripts can be found HERE.