Please watch the Introduction to Slurm Tutorials - https://slurm.schedmd.com/tutorials.html
Running Your Jobs
(Quick start) Starting an interactive session using srun
To get started quickly, from a login node:
...
For more options and examples on how to use srun to run an interactive job, see https://slurm.schedmd.com/srun.html
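As a sketch, a typical interactive request might look like the following; the partition name, resource counts, and time limit are illustrative and should be matched against the partition table below:

```shell
# Request an interactive shell on one compute node for one hour
# (partition name and limits are illustrative -- see the partition table)
srun --partition=compute --nodes=1 --ntasks=1 --time=01:00:00 --pty bash
```

When the allocation is granted, your prompt moves to the assigned compute node; exiting the shell releases the allocation.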
Starting an independent interactive session using salloc
This command allocates a node, or collection of nodes, for your use. Basic usage:
...
indicating that you are now on c002. You can now run jobs as you normally would. So why use salloc? Because you reach the allocated node over a separate ssh connection, and ssh allows port forwarding, which is useful if you want to run Jupyter notebooks.
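A hedged sketch of the salloc-plus-forwarding workflow is below; the node name `c002` comes from the example above, while the login hostname, username, and port are placeholders you would replace with your own:

```shell
# Allocate one compute node (illustrative partition and time limit)
salloc --partition=compute --nodes=1 --time=02:00:00

# From your local machine, forward a local port through the login node
# to the allocated compute node (hostname and username are placeholders):
ssh -L 8888:c002:8888 username@login-node

# On the compute node, start Jupyter bound to the forwarded port:
jupyter notebook --no-browser --port=8888
```

With the tunnel in place, pointing a local browser at `localhost:8888` reaches the notebook server running on the compute node.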
(Recommended) Starting a batch job using sbatch
Interactive jobs are good for testing and development, but production jobs should be submitted with the sbatch command and a run script. This places your job in the scheduler's queue, and the scheduler starts it automatically as soon as resources become available.
...
to follow your code's output in real time by tailing the output file.
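A minimal run script might look like the following sketch; the job name, resource requests, and program path are illustrative, and the partition and time limit should respect the partition table below:

```shell
#!/bin/bash
#SBATCH --job-name=myjob          # job name shown in the queue
#SBATCH --partition=compute       # partition from the table below
#SBATCH --nodes=1                 # number of nodes requested
#SBATCH --ntasks-per-node=4      # tasks per node (illustrative)
#SBATCH --time=04:00:00           # walltime, within the partition limit
#SBATCH --output=myjob-%j.out     # %j expands to the job ID

srun ./my_program                  # launch the job step (placeholder binary)
```

Submit it with `sbatch run.sh`, then follow progress with `tail -f` on the generated `myjob-<jobid>.out` file.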
INCLINE Partitions
Status | Partition Name | Access | Resources | Max # nodes | Max time | Current Priority Job Factor (higher number = higher priority) | Description |
---|---|---|---|---|---|---|---|
Enabled | compute | All | Compute nodes | 26 | 24h | 2 | This is the standard workhorse category of partitions for most HPC codes. Jobs submitted to these queues are reasonably high priority, but have a 24 hour time limit. |
Enabled | gpu | All | GPU nodes | 2 | 24h | 2 | |
Enabled | bigmem | All | High memory nodes | 2 | 24h | 2 | |
Disabled | compute-quick | All | Compute nodes | 2 | 1h | 3 | These partitions are for testing or debugging. Submitting to these queues gets your code running quickly. |
Disabled | gpu-quick | All | GPU nodes | 1 | 1h | 3 | |
Disabled | bigmem-quick | All | High memory nodes | 1 | 1h | 3 | |
Disabled | compute-long | All | Compute nodes | 13 | 720h | 1 | Use these partitions for long-time jobs that are expected to take multiple days or even weeks. These partitions are low priority but have a long runtime. |
Disabled | gpu-long | All | GPU nodes | 1 | 720h | 1 | |
Disabled | bigmem-long | All | High memory nodes | 1 | 720h | 1 | |
Disabled | compute-unlimited | Privileged | Compute nodes only | 26 | Unlimited | 100 | These partitions are high-priority, unlimited queues accessible to privileged users only. Use of these queues is available by special request only. The unlimited queues are used for ultra-large-scale production runs, benchmarking tests, etc. |
Disabled | gpu-unlimited | Privileged | GPU nodes only | 2 | Unlimited | 100 | |
Disabled | bigmem-unlimited | Privileged | High memory nodes only | 2 | Unlimited | 100 | |
Disabled | compute-USER | USER | Compute nodes | N | Unlimited | 100 | These partitions are for users who are the owners of individual nodes on INCLINE. For instance, if bobsmith is a PI who has paid to purchase a compute node, then compute-bobsmith is a special high-priority queue accessible to him and his designated users only. |
Disabled | gpu-USER | USER | GPU nodes | N | Unlimited | 100 | |
Disabled | bigmem-USER | USER | High memory nodes | N | Unlimited | 100 | |
For a detailed discussion of the Slurm prioritization and fairshare algorithms, see this presentation.
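To check what the partitions above look like on the live system, and how the scheduler is prioritizing your pending jobs, standard Slurm query commands can be used; the following is a sketch:

```shell
# Summarize partitions, their availability, time limits, and node counts
sinfo --summarize

# Show the computed priority factors for your own pending jobs
sprio --user=$USER
```

Comparing the `sinfo` output against the table above confirms which partitions are currently enabled.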