Ex 6.) Examine the job queues


Introduction

The login servers are configured to run relatively small programs, such as editors, browsers and small compilations. Larger programs must be executed on a compute node (or compute server). Work is submitted to the compute nodes as a “job” via a “queue”. Once a job is submitted, the “scheduler” and “resource manager” start it on the most suitable compute node(s) when appropriate. Often this is immediate, but if the load is heavy the job is “queued” and executed later, when the required resources become available. Each partition has its own queue.

Exercise

The commands to examine the partitions and associated job queues are called sinfo and squeue. Additionally, sstat can be used to display various status information about a running job. Use their man pages to find out how to do the following:

  • Find out the state of the nodes in each partition (are there any dysfunctional nodes? If so, what are the reasons?)
  • Show all running jobs.
  • Show all queued jobs just for a specific user.
  • Show the nodes currently being used by a specific running job (any job will do).
  • Show a summary of each of the queues.
  • Show all running and queued jobs for a specific queue (any queue will do).
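
Once you have tried the man pages yourself, the tasks above might be approached along the following lines. This is only a sketch under stated assumptions, not the one correct answer: the variable names QUEUE_USER, PARTITION and JOBID, the partition name "compute" and the job ID 12345 are placeholders invented for illustration, and the function name examine_queues is hypothetical. Replace them with values from your own cluster.

```shell
# Placeholders (assumptions): replace with real values from your own cluster.
QUEUE_USER="${QUEUE_USER:-$(id -un)}"   # user whose jobs to list
PARTITION="${PARTITION:-compute}"       # hypothetical partition name
JOBID="${JOBID:-12345}"                 # hypothetical ID of a running job

examine_queues() {
    # State of the nodes in each partition, then the reasons recorded
    # for any nodes that are down, drained or failing.
    sinfo
    sinfo --list-reasons

    # All running jobs.
    squeue --states=RUNNING

    # Queued (pending) jobs for one user only.
    squeue --user="$QUEUE_USER" --states=PENDING

    # Nodes used by one job: %i is the job ID, %N the node list.
    squeue --jobs="$JOBID" --format="%i %N"

    # One-line summary of each partition.
    sinfo --summarize

    # Running and queued jobs in one partition.
    squeue --partition="$PARTITION" --states=RUNNING,PENDING

    # Status of a running job (see "sstat --helpformat" for other fields).
    sstat --jobs="$JOBID" --format=JobID,AveCPU,MaxRSS
}

# Only run the commands where the Slurm client tools are installed.
if command -v sinfo >/dev/null 2>&1; then
    examine_queues
else
    echo "Slurm commands not found; run this on a login server."
fi
```

The long option names are used here because they are easier to look up in the man pages; each has a short form (for example -R for --list-reasons and -u for --user).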