Managing jobs using Slurm Workload Manager

On HPC clusters, computations should be performed on the compute nodes. Special programs called resource managers, workload managers, or job schedulers allocate processors and memory on the compute nodes to users' jobs. On Condo, the Slurm Workload Manager is used for this purpose. Jobs can be run in interactive or batch mode. When executing or debugging short-running jobs that use a small number of MPI processes, interactive execution may speed up program development compared to batch execution. To start a one-hour interactive session, issue:

salloc -N1 -t 1:00:00

Your environment, including loaded environment modules, will be copied to the interactive session. It is important to issue

exit

when you're done, so that the resources assigned to your interactive job are freed and can be used by other users.
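
For example, a typical interactive session, from allocation to exit, might look like the sketch below; the module and program names are placeholders, so load and run whatever your own work requires:

salloc -N1 -n4 -t 1:00:00    # request 1 node and 4 tasks for one hour
module load openmpi          # hypothetical module name; load what your program needs
srun ./my_program            # run the program on the allocated cores
exit                         # release the allocation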

For longer jobs, however, the batch mode should be used. In this case a job script should be created and submitted to the queue by issuing:

sbatch <job_script>

The job script contains Slurm settings, such as the number of cores and the time requested, as well as the commands that should be executed during the batch session. Use the Slurm Job Script Generator for Condo to create job scripts.
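
As an illustration, a minimal job script might look like the following sketch; the resource amounts, module name, and program name are placeholders, and the Script Generator should be preferred for producing scripts tailored to Condo:

#!/bin/bash
#SBATCH --nodes=1               # number of nodes
#SBATCH --ntasks-per-node=16    # MPI processes per node (placeholder value)
#SBATCH --time=2:00:00          # walltime limit (HH:MM:SS)
#SBATCH --job-name=my_job       # job name shown in the queue
#SBATCH --output=my_job.%j.out  # standard output file; %j expands to the job ID

module load openmpi             # hypothetical module name
srun ./my_program               # run the program on the allocated cores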

In Slurm, queues are called partitions. Only partitions for special nodes (such as fat, huge, or GPU nodes) need to be specified when submitting jobs; otherwise Slurm will place the job into a partition based on the number of nodes and the time requested. The Script Generator automatically adds the appropriate partition to the generated job script if accelerator nodes were selected.
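
For example, a job script that targets a GPU node might include lines like the following; the partition name and GPU count here are assumptions, so check sinfo or use the Script Generator for the actual values on Condo:

#SBATCH --partition=gpu    # hypothetical partition name for GPU nodes
#SBATCH --gres=gpu:1       # request one GPU on the node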

To see the list of available partitions, issue:

sinfo
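
The output can be customized with standard sinfo format options; for example, the following lists each partition together with its time limit, node count, and node state:

sinfo -o "%P %l %D %t"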

For more details on partition limits, issue:

scontrol show partitions
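
A single partition can also be inspected by passing its name; the partition name below is a placeholder:

scontrol show partition <partition_name>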

To see the job queue, issue:

squeue
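
To list only your own jobs, you can filter the queue by user name:

squeue -u $USER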

To cancel job <job_id>, issue:

scancel <job_id>
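
All of your own jobs can be cancelled at once by user name:

scancel -u $USER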

To see resource usage for a finished job <job_id>, issue:

seff <job_id>
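
More detailed accounting for a finished job is available through sacct; for example, the following uses standard sacct fields to show elapsed time, peak memory, and final state:

sacct -j <job_id> --format=JobID,JobName,Elapsed,MaxRSS,State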

When an interactive or batch job is submitted to Slurm, the workload manager places the job in the queue. Each job is given a priority, which may change while the job waits in the queue. We use Slurm's fair-share scheduling on clusters at ISU: job priority depends on how many resources have been used by the user and the user's group, on the group's contribution to the cluster, and on how long the job has been waiting for resources. Based on job priority and the amount of resources requested versus available, Slurm decides which resources to allocate to each job.
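
While a job is pending, the individual priority factors that Slurm has computed for it (including the fair-share component) can be inspected with sprio:

sprio -j <job_id>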

You can use the slurm-usage.py command to see your group's usage. To see the available options, issue the "slurm-usage.py -h" command.