Ex 8.) Submitting a batch job

Introduction

In this exercise we will run our simple R commands as a batch job. As with the interactive job in the previous exercise, we use the sbatch command. Full documentation on sbatch can be found in the system’s man pages (man sbatch) or in its online documentation. The usual way to submit a batch job is to create a job script. In the script, lines starting with “#SBATCH” are called directives; each one maps to an sbatch argument that could otherwise be given on the command line. Here is an example job script:

#!/usr/bin/env bash
#
#SBATCH --job-name=training_batch
#SBATCH --partition=training.q
#SBATCH --ntasks=1
#SBATCH --time=1:00

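# report the job name, working directory and start time
echo "$SLURM_JOB_NAME"
echo "Current working directory is $(pwd)"
echo "Starting run at: $(date)"

# start from a clean environment, then load the required modules
module purge
module load system/intel64
module load R/3.6.3

# run the R script as a job step and remember its exit code
srun R -f training/src/square.r
exit_code=$?

# output how and when job finished
echo "Program finished with exit code $exit_code at: $(date)"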
# end of jobscript

By default, SLURM writes the job's standard output and standard error to a single file named slurm-<jobid>.out, created in the directory from which the job was submitted.
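To make the link between directives and command-line arguments concrete, the sketch below lists the #SBATCH lines of a job script as the options you could pass to sbatch directly. It writes a shortened copy of the example script to a temporary file purely for illustration; the temporary file and the variable names are assumptions, not part of the training material.

```shell
#!/usr/bin/env bash
# Sketch: show the #SBATCH directives of a job script as sbatch options.

# Illustrative copy of the example script's header in a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env bash
#
#SBATCH --job-name=training_batch
#SBATCH --partition=training.q
#SBATCH --ntasks=1
#SBATCH --time=1:00

echo "$SLURM_JOB_NAME"
EOF

# Strip the "#SBATCH" prefix from each directive and join them on one line.
options=$(sed -n 's/^#SBATCH[[:space:]]*//p' "$script" | tr '\n' ' ')
options=${options% }   # drop the trailing space left by tr

# Print (do not run) the equivalent command line.
echo "sbatch $options batch-job.sh"
rm -f "$script"
```

If an option is given both on the command line and as a directive, the command-line value takes precedence, which is handy for one-off changes such as a longer time limit.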

Exercise

Copy and paste the above example into a file called “batch-job.sh” in your $HOME. Submit the job:

cd $HOME; sbatch batch-job.sh

Use the squeue command to confirm the status of the job.
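As a minimal sketch of this step, the snippet below submits the script, captures the job ID and asks squeue about that job only. It assumes batch-job.sh is in your $HOME as described above; the helper function extract_job_id is a hypothetical name introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: submit the job script, capture its ID, and query its status.

# With --parsable, sbatch prints "jobid" or "jobid;cluster";
# keep only the part before the first semicolon.
extract_job_id() {
    printf '%s\n' "${1%%;*}"
}

# Only attempt a submission where SLURM is available and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f "$HOME/batch-job.sh" ]; then
    cd "$HOME" || exit 1
    job_id=$(extract_job_id "$(sbatch --parsable batch-job.sh)")
    # Columns: job ID, job name, state, and reason.
    squeue -j "$job_id" -o "%i %j %T %r"
    # Alternatively, list all of your own jobs.
    squeue -u "$USER"
fi
```

A job this short may already have finished by the time squeue runs, in which case it no longer appears in the listing.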

Look for the output file and examine its content. Then try to modify your submission script to tell SLURM to write the output and errors (technically, the stdout and stderr streams) to separate files and to store them in the training/logs folder.

Look in the sbatch documentation to find out how to resubmit the job in a held state. A held job appears in the queue with state PD (pending) and the reason JobHeldUser.

Release the job (hint: scontrol) and confirm it runs.
