Ex 9.) Batch jobs: Arrays & Dependencies


Introduction

Arrays

The recommended way to submit many jobs (more than 100) is SLURM’s job array feature. Job arrays let you submit and manage a large number of similar jobs faster and more effectively than submitting each one separately.

To create a job array, use the --array option to tell SLURM which indices the array contains:

  • --array=0-9. The array has 10 jobs. The first job has index 0 and the last job has index 9.
  • --array=5-8. The array has 4 jobs. The first job has index 5 and the last job has index 8.
  • --array=2,4,6. The array has 3 jobs, with indices 2, 4 and 6.
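
To check how many jobs a specification will create before submitting it, the expansion can be sketched in plain bash. The function expand_array_spec below is a hypothetical helper, not part of SLURM, and it only handles the comma-separated values and N-M ranges shown above:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a SLURM command): print the indices that a
# simple --array specification expands to, one per line.
# Only comma-separated values and N-M ranges are handled.
expand_array_spec() {
    local spec=$1 part start end i
    local -a parts
    IFS=',' read -r -a parts <<< "$spec"
    for part in "${parts[@]}"; do
        if [[ $part == *-* ]]; then
            start=${part%-*}   # text before the dash
            end=${part#*-}     # text after the dash
            for ((i = start; i <= end; i++)); do
                echo "$i"
            done
        else
            echo "$part"
        fi
    done
}

# List the indices for two of the specifications above
expand_array_spec 5-8
expand_array_spec 2,4,6
```

Piping the result through wc -l gives the number of jobs in the array.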

Now you can write a job submission script that looks like:

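#!/usr/bin/env bash
#
#SBATCH --job-name=training_batch
#SBATCH --partition=training.q
#SBATCH --cpus-per-task=1
# Time limit of 1 minute (the format is minutes:seconds)
#SBATCH --time=1:00
# %A is the job ID of the array, %a is the index of the array job
#SBATCH --output=test_%A_%a.out
#SBATCH --array=1-3

# Each array job prints the job name and its own index
echo "$SLURM_JOB_NAME"
echo "$SLURM_ARRAY_TASK_ID"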
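
In practice the index is usually used to pick a different input for each array job. A minimal sketch, assuming input files named input_1.txt, input_2.txt and so on (a hypothetical naming scheme, not something the script above requires):

```shell
#!/usr/bin/env bash
# Hypothetical naming scheme: one input file per array index.
input_for_task() {
    echo "input_${1}.txt"
}

# File name that --output=test_%A_%a.out resolves to for a given
# array job ID (%A) and array index (%a).
output_for_task() {
    echo "test_${1}_${2}.out"
}

# SLURM sets these variables inside an array job. The defaults are an
# assumption that lets the sketch be dry-run outside SLURM.
task_id=${SLURM_ARRAY_TASK_ID:-1}
job_id=${SLURM_ARRAY_JOB_ID:-0}

echo "Array job ${task_id} reads $(input_for_task "$task_id")"
echo "Its output is written to $(output_for_task "$job_id" "$task_id")"
```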

 

Dependencies

Pipelines often require that a job is launched only after previous jobs have completed successfully. SLURM supports such pipelines through the --dependency option:

  • --dependency=afterok:<job_id>. The submitted job is launched if and only if the job with identifier job_id completed successfully. If job_id is a job array, all jobs in that array must complete successfully.
  • --dependency=afternotok:<job_id>. The submitted job is launched if and only if the job with identifier job_id failed. If job_id is a job array, at least one job in that array must have failed. This option can be useful for a cleanup step.
  • --dependency=afterany:<job_id>. The submitted job is launched after the job with identifier job_id has terminated, i.e. completed successfully or failed.
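
The options above can be combined into a small pipeline. The sketch below uses sbatch --parsable, which prints only the job ID (optionally followed by ";" and a cluster name), so the ID can be passed to --dependency without copying it by hand. The scripts post-process.sh and cleanup.sh are hypothetical placeholders:

```shell
#!/usr/bin/env bash
# Sketch of a three-step pipeline. Assumptions: array-job.sh is the
# array script from this page; post-process.sh and cleanup.sh are
# hypothetical scripts used only for illustration.
submit_chain() {
    local array_id next_id cleanup_id

    # Step 1: the array job
    array_id=$(sbatch --parsable --array=1-10 array-job.sh) || return 1
    array_id=${array_id%%;*}   # drop a trailing ";cluster" if present

    # Step 2: runs only if every job in the array succeeded
    next_id=$(sbatch --parsable --dependency=afterok:"$array_id" post-process.sh) || return 1

    # Step 3: runs only if at least one job in the array failed
    cleanup_id=$(sbatch --parsable --dependency=afternotok:"$array_id" cleanup.sh) || return 1

    echo "array=${array_id} next=${next_id%%;*} cleanup=${cleanup_id%%;*}"
}
```

Calling submit_chain on a login node submits all three jobs and prints their IDs.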

Exercise

Copy and paste the above example into a file called “array-job.sh” in your $HOME, then submit the job:

cd $HOME; sbatch array-job.sh

Find the output files and examine their content. Then modify the script to run 6 array jobs and check their status in the queue (squeue -r or scontrol show job jobid).

The output should be similar to :-

Now submit the script as an array of 100 jobs, with at most 5 running at the same time. The %5 suffix sets this limit:

sbatch --array=1-100%5 array-job.sh

Check the status of your job.
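
With 100 array jobs the full squeue listing gets long. One way to summarise it, sketched here as a hypothetical helper function, is to print only the state of each array job and count how many are in each state:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the array jobs of one job ID per state.
# squeue options used: -r one line per array job, -h no header,
# -j select the job ID, -o '%T' print only the state.
count_task_states() {
    local job_id=$1
    squeue -r -h -j "$job_id" -o '%T' | sort | uniq -c
}
```

Run it as count_task_states jobid. With the %5 limit in place, no more than 5 jobs should be reported as RUNNING at any time.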

If some of your jobs are still queued, cancel the remainder:

scancel jobid

Then check the output files to see how many of the array jobs ran.
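
Note that scancel jobid also cancels array jobs that are already running. A minimal sketch of two hypothetical helpers: one cancels only the jobs that are still pending, the other counts the output files, assuming the test_%A_%a.out naming from the script above:

```shell
#!/usr/bin/env bash
# Hypothetical helper: cancel only the array jobs that have not started.
cancel_pending() {
    local job_id=$1
    scancel --state=PENDING "$job_id"
}

# Hypothetical helper: count the output files written for one array.
# Assumes the files are named test_<jobid>_<index>.out and live in the
# given directory (default: the current directory).
count_outputs() {
    local job_id=$1 dir=${2:-.}
    find "$dir" -maxdepth 1 -name "test_${job_id}_*.out" | wc -l
}
```

Array jobs that were cancelled before they started do not leave an output file, so count_outputs jobid gives the number of array jobs that actually ran.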

Finally, submit a batch job with array indices 1-10, then submit a second batch job with array indices 11-20 that will only run if all jobs in the first array complete successfully. Replace jobid with the job ID reported for the first submission:

sbatch --array=1-10 array-job.sh

sbatch --dependency=afterok:jobid --array=11-20 array-job.sh

Check the status of your job on the queue. Did the first array of jobs work and has the second array run?
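
Once jobs have finished they no longer appear in squeue. If job accounting is enabled on the cluster (an assumption, this depends on the site), sacct can report their final state. A minimal sketch with a hypothetical helper that succeeds only if every array job finished as COMPLETED:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if all array jobs of a job ID
# ended in state COMPLETED. Requires job accounting to be enabled.
# sacct options used: -X one line per job (no job steps),
# --noheader and --parsable2 for plain output, --format=State.
all_completed() {
    local job_id=$1 states
    states=$(sacct -j "$job_id" -X --noheader --parsable2 --format=State) || return 2
    [[ -n $states ]] || return 2   # nothing reported for this job ID
    # grep -qvx succeeds if any line is something other than COMPLETED
    ! grep -qvx 'COMPLETED' <<< "$states"
}
```

For example, all_completed jobid && echo "first array succeeded" shows whether the afterok condition for the second array was met.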

 
