Please watch the Introduction to Slurm Tutorials - https://slurm.schedmd.com/tutorials.html

Running Your Jobs

(Quick start) Starting an interactive session using srun

To get started quickly, from a login node:

...

For more options and examples on how to use srun to run an interactive job, see https://slurm.schedmd.com/srun.html
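As a rough illustration of the kind of request described above, the sketch below builds an srun command for an interactive shell. The partition, node count, and time limit are assumed values chosen for the example, not site recommendations; `--partition`, `--nodes`, `--ntasks`, `--time`, and `--pty` are standard srun options.

```shell
#!/bin/sh
# Assumed example values; adjust them to your own needs.
PARTITION="compute"      # a partition name from the INCLINE Partitions table
TIME_LIMIT="01:00:00"    # HH:MM:SS, must not exceed the partition's max time

# --pty attaches your terminal to the job so the shell is interactive.
SRUN_CMD="srun --partition=$PARTITION --nodes=1 --ntasks=1 --time=$TIME_LIMIT --pty bash -i"
echo "$SRUN_CMD"

# Launch only where Slurm is installed (for example, on a login node).
if command -v srun >/dev/null 2>&1; then
    $SRUN_CMD
fi
```

Typing `exit` in the interactive shell ends the job and releases the node.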

Starting an independent interactive session using salloc

This command allocates a node, or collection of nodes, for your use. Basic usage:

...

indicating you are now on c002. You can now run jobs as you normally would. So why use salloc? Because you reach the node over an ssh connection, you can set up port forwarding, which is useful if you want to use Jupyter notebooks.
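The port-forwarding workflow might look like the sketch below. It only prints the two commands. The login host name, the port, and the use of ssh's `-J` (jump host) option to reach the node from your own machine are all assumptions for illustration; c002 stands for whichever node your allocation reports.

```shell
#!/bin/sh
# All values below are placeholders, not INCLINE-specific settings.
LOGIN_HOST="login.example.edu"   # hypothetical login node address
NODE="c002"                      # the node your allocation landed on
PORT=8888                        # port the Jupyter server listens on

# Step 1 (on the login node): request one node for two hours.
ALLOC_CMD="salloc --partition=compute --nodes=1 --time=02:00:00"

# Step 2 (on your own machine): hop through the login node to the
# allocated node and forward local port $PORT to the same port there.
TUNNEL_CMD="ssh -J $LOGIN_HOST -L $PORT:localhost:$PORT $NODE"

echo "$ALLOC_CMD"
echo "$TUNNEL_CMD"
```

With the tunnel open and a Jupyter server running on the node, the notebook should be reachable in a local browser at localhost on the chosen port.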

(Recommended) Starting a batch job using sbatch

Interactive jobs are good for testing and development, but production jobs should be submitted with the sbatch command, which takes a run script. This places your job in the scheduler's queue, and it starts automatically as soon as the requested resources are available.

...

to get real-time output of your code by tracking the output file.
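For reference, a minimal batch workflow could be sketched as follows. In Slurm, run scripts are submitted with sbatch, while squeue lists the jobs in the queue. The file name, job name, and resource values are hypothetical; the `#SBATCH` options shown are standard, and `%j` in the output name expands to the job ID.

```shell
#!/bin/sh
# Write a minimal run script; the file and job names are hypothetical.
cat > myjob.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --partition=compute
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
#SBATCH --output=myjob-%j.out

# Replace this line with the program you want to run.
srun hostname
EOF

# Submit and check the queue only where Slurm is installed.
if command -v sbatch >/dev/null 2>&1; then
    sbatch myjob.sh          # prints the assigned job ID
    squeue -u "$USER"        # lists your pending and running jobs
fi
```

Once the job is running, `tail -f` on the job's output file (here `myjob-<jobid>.out`) follows the output as it is written.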

INCLINE Partitions

| Status | Partition Name | Access | Resources | Max # nodes | Max time | Current Priority Job Factor (higher number = higher priority) | Description |
|---|---|---|---|---|---|---|---|
| Enabled | compute | All | Compute nodes | 26 | 24h | 2 | This is the standard workhorse category of partitions for most HPC codes. Jobs submitted to these queues are reasonably high priority, but have a 24 hour time limit. |
| Enabled | gpu | All | GPU nodes | 2 | 24h | 2 | |
| Enabled | bigmem | All | High memory nodes | 2 | 24h | 2 | |
| Disabled | compute-quick | All | Compute nodes | 2 | 1h | 3 | These partitions are for testing or debugging. Submitting to these queues gets your code running quickly. |
| Disabled | gpu-quick | All | GPU nodes | 1 | 1h | 3 | |
| Disabled | bigmem-quick | All | High memory nodes | 1 | 1h | 3 | |
| Disabled | compute-long | All | Compute nodes | 13 | 720h | 1 | Use these partitions for long-running jobs that are expected to take multiple days or even weeks. These partitions are low priority but have a long runtime. |
| Disabled | gpu-long | All | GPU nodes | 1 | 720h | 1 | |
| Disabled | bigmem-long | All | High memory nodes | 1 | 720h | 1 | |
| Disabled | compute-unlimited | Privileged | Compute nodes only | 26 | Unlimited | 100 | These partitions are high-priority, unlimited queues accessible to privileged users only. Use of these queues is available by special request only. The unlimited queues are used for ultra-large-scale production runs, benchmarking tests, etc. |
| Disabled | gpu-unlimited | Privileged | GPU nodes only | 2 | Unlimited | 100 | |
| Disabled | bigmem-unlimited | Privileged | High memory nodes only | 2 | Unlimited | 100 | |
| Disabled | compute-USER | USER | Compute nodes | N | Unlimited | 100 | These partitions are for users who are the owners of individual nodes on INCLINE. For instance, if bobsmith is a PI who has paid to purchase a compute node, then compute-bobsmith is a special high-priority queue accessible to him and his designated users only. |
| Disabled | gpu-USER | USER | GPU nodes | N | Unlimited | 100 | |
| Disabled | bigmem-USER | USER | High memory nodes | N | Unlimited | 100 | |

For a detailed discussion of Slurm prioritization and the fairshare algorithm, see this presentation.
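The time limits in the partition table can be checked before you submit. The helper below is a hypothetical sketch, not a Slurm command: it hard-codes the limits as listed in the table and ignores whether a partition is currently enabled, so the table remains the authority if the two disagree.

```shell
#!/bin/sh
# max_hours PARTITION
# Prints the maximum run time in hours as listed in the partition table,
# "unlimited" for the unlimited and owner partitions, or "unknown".
max_hours() {
    case "$1" in
        *-quick)                    echo 1 ;;
        *-long)                     echo 720 ;;
        compute|gpu|bigmem)         echo 24 ;;
        compute-*|gpu-*|bigmem-*)   echo unlimited ;;
        *)                          echo unknown; return 1 ;;
    esac
}

# fits PARTITION HOURS
# Succeeds if a job of HOURS hours is within the partition's limit.
fits() {
    limit=$(max_hours "$1") || return 1
    [ "$limit" = "unlimited" ] || [ "$2" -le "$limit" ]
}

fits compute 12 && echo "12 h fits in compute"
fits gpu 48 || echo "48 h exceeds the gpu limit"
```

A request that fits can then be passed to the scheduler with `--partition` and `--time`, for example `--time=12:00:00` for a 12-hour job.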