Managing jobs using Slurm Workload Manager

On HPC clusters, computations should be performed on the compute nodes. Special programs called resource managers, workload managers, or job schedulers allocate processors and memory on compute nodes to users' jobs. On Nova, the Slurm Workload Manager is used for this purpose. Jobs can be run in interactive or batch mode. When running or debugging short jobs that use a small number of MPI processes, interactive execution can speed up program development compared with batch execution. To start an interactive session on one node for one hour, issue:

salloc -N1 -t 1:00:00

Your environment, such as loaded environment modules, will be copied to the interactive session. It's important to issue

exit

when you're done, so that the resources assigned to your interactive job are freed and can be used by other users.

However, longer jobs should be run in batch mode. In this case, create a job script and submit it to the queue by issuing:

sbatch <job_script>


The job script contains Slurm settings, such as the number of cores and the time requested, and the commands to be executed during the batch session. Use the Slurm Job Script Generator for Nova to create job scripts.
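
For orientation, a hand-written job script might look roughly like the sketch below. It is not output of the Script Generator; the resource values, the job name, and the output file name are illustrative assumptions, and the final workload line is a placeholder to be replaced with your own commands.

```shell
#!/bin/bash
# Illustrative values only; adjust them to your job.
# Number of nodes
#SBATCH --nodes=1
# Tasks (cores) per node
#SBATCH --ntasks-per-node=4
# Wall-time limit, hh:mm:ss
#SBATCH --time=1:00:00
# Hypothetical job name
#SBATCH --job-name=example
# Output file; %j expands to the job ID
#SBATCH --output=example-%j.out

# Commands below run once the job starts.
# SLURM_JOB_ID is set by Slurm inside a job; "unknown" is a fallback elsewhere.
msg="Job ${SLURM_JOB_ID:-unknown} started on $(hostname)"
echo "$msg"

# Replace with the real workload, for example:
#   srun ./my_program   (my_program is a hypothetical executable)
```

The lines starting with #SBATCH are read by Slurm at submission time and ignored by the shell, so they must come before the first command in the script.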

In Slurm, queues are called partitions. Only partitions for special nodes (such as fat, huge, GPU) need to be specified when submitting jobs. Otherwise, Slurm will place the job into a partition based on the number of nodes and the time requested. The Script Generator will automatically add the appropriate partition to the generated job script if accelerator nodes are selected.


To see the list of available partitions, issue:

sinfo

For more details on partition limits, issue:

scontrol show partitions


To see the job queue, issue:

squeue

To cancel job <job_id>, issue:

scancel <job_id>


To see job usage for a finished job <job_id>, issue:

seff <job_id>
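
The submission and monitoring commands can be combined into one short workflow. The sketch below assumes a job script named job.sh (a hypothetical file name) and uses two standard Slurm options not mentioned above: sbatch --parsable, which prints only the job ID, and squeue -j, which restricts the queue listing to one job. It does nothing harmful when run off the cluster.

```shell
#!/bin/bash
job_script="job.sh"   # hypothetical file name

if command -v sbatch >/dev/null 2>&1 && [ -f "$job_script" ]; then
    # --parsable prints only the job ID (followed by ";cluster" on
    # multi-cluster setups), which makes it easy to capture.
    if job_id=$(sbatch --parsable "$job_script"); then
        job_id=${job_id%%;*}
        echo "Submitted job $job_id"
        # Show only this job in the queue.
        squeue -j "$job_id"
        # Later, once the job has finished:  seff "$job_id"
        # To cancel it instead:              scancel "$job_id"
    else
        echo "Submission failed" >&2
    fi
else
    echo "sbatch or $job_script not found; run this on a cluster login node."
fi
```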


When an interactive or batch job is submitted to Slurm, the workload manager places the job in the queue. Each job is given a priority, which may change while the job waits in the queue. We use Slurm's fair-share scheduling on clusters at ISU. Job priority depends on how many resources have been used by the user or the user's group, the group's contribution to the cluster, and how long the job has been waiting for resources. Based on job priority and the amount of resources requested versus available, Slurm decides which resources to allocate to jobs.
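
To see how these factors add up for your own pending jobs, Slurm's standard sprio utility may help. The sketch below assumes sprio is available on the cluster and that priority factors are exposed to regular users; neither is confirmed by the text above.

```shell
#!/bin/bash
# Current user name; falls back to id(1) if USER is not set.
user="${USER:-$(id -un)}"

if command -v sprio >/dev/null 2>&1; then
    # -u limits the listing to one user's pending jobs;
    # -l (long format) reports more of the available priority information.
    sprio -l -u "$user"
else
    echo "sprio not found; run this on a cluster login node."
fi
```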

You can use command to see your group usage. To see available options issue " -h" command.