This is a hyper-condensed summary of Slurm basics. If you haven’t already, we highly recommend you go through Part 1 and Part 2 of the Slurm tutorial (otherwise, everything below won’t be very useful to you).
1 - Slurm Commands
| Purpose | Example | Notes |
|---|---|---|
| View Slurm Accounts | $ my-accounts | |
| View Recent Job Details | $ sacct | |
| Submit Batch Job | $ sbatch <sbatch_script.sh> | |
| Submit Dependent Batch Job | $ sbatch --dependency=afterok:<job_id> <sbatch_script.sh> | |
| Cancel Specific Job | $ scancel <job_id> | |
| Cancel All of Your Jobs | $ scancel --me | |
| View Specific Job Details | $ scontrol show job <job_id> | |
| View Cluster Details | $ sinfo | |
| Submit Interactive Job | $ sinteractive | |
| Submit Interactive Job w/ GPU | $ sinteractive --gres=gpu:<type>:<number> | |
| See All Jobs in Queue | $ squeue | |
| See Only Your Jobs | $ squeue --me | |
| See When Your Jobs Will Start | $ squeue --me --start | Worst-case estimate |
| See Upcoming Maintenance Windows | $ time-until-maintenance | |
| View Job Priority Information | $ sprio -u <rit_username> | |
| Parallel Computation | srun <options> <command> | Inside sbatch script |
| Heterogeneous Group | srun --pack-group <number> | Inside sbatch script |
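The dependent-job pattern above is easiest to use from a small wrapper script. A minimal sketch, assuming a Slurm cluster and two hypothetical batch scripts (`preprocess.sh`, `analyze.sh`); `--parsable` makes `sbatch` print only the job ID so it can be captured:

```shell
#!/bin/bash
# Sketch: chain two batch jobs so the second runs only if the first succeeds.
# preprocess.sh and analyze.sh are hypothetical sbatch scripts.

# --parsable prints just the job ID instead of "Submitted batch job <id>"
jobid=$(sbatch --parsable preprocess.sh)

# afterok: start this job only after job $jobid exits with code 0
sbatch --dependency=afterok:"${jobid}" analyze.sh
```

Other dependency types (e.g. `afterany`, `afternotok`) follow the same form.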
2 - Slurm Configuration Options
| Option | Example | Notes |
|---|---|---|
| Job Name | #SBATCH --job-name=<job_name> | |
| Comment | #SBATCH --comment=<comment> | |
| Account | #SBATCH --account=<account_name> | |
| Partition | #SBATCH --partition=<debug,tier3> | Use debug only for debugging |
| Time Limit | #SBATCH --time=D-HH:MM:SS | |
| Output File | #SBATCH --output=<optional_dir>/%x_%j.out | Create folder before submitting |
| Error File | #SBATCH --error=<optional_dir>/%x_%j.err | Create folder before submitting |
| Slack Username | #SBATCH --mail-user=slack:@<abc1234> | |
| Notification Type | #SBATCH --mail-type=<BEGIN,END,FAIL,ALL> | |
| Number of Nodes | #SBATCH --nodes=<num_nodes> | |
| Excluding Nodes | #SBATCH --exclude=<node1,node2,...> | Do not run on these nodes |
| Exclusive Node Access | #SBATCH --exclusive | Use with --mem=0 |
| Number of Tasks | #SBATCH --ntasks=<num_tasks> | i.e. processes; default = 1 |
| Tasks per Node | #SBATCH --ntasks-per-node=<num_tasks> | Default = 1 |
| CPUs per Task | #SBATCH --cpus-per-task=<num_cpus> | Default = 1 |
| Request Specific GPUs | #SBATCH --gres=gpu:<type>:<number> | Use what you request |
| Request Generic GPUs | #SBATCH --gres=gpu:<number> | Use what you request |
| GPUs per Task | #SBATCH --gpus-per-task=<type>:<number> | Use what you request |
| Memory per Node | #SBATCH --mem=<number><k,m,g,t> | Default unit is MB; specify units |
| Memory per CPU | #SBATCH --mem-per-cpu=<number><k,m,g,t> | Default unit is MB; specify units |
| All Memory on Node | #SBATCH --mem=0 | Use with --exclusive |
| Job Array Size | #SBATCH --array=<value>-<value> | |
| Heterogeneous Separator | #SBATCH hetjob | |
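Put together, a typical batch script combining the options above might look like the following sketch. The account, partition, and resource values are illustrative placeholders; substitute your own (check `my-accounts` for valid account names):

```shell
#!/bin/bash
#SBATCH --job-name=example_job
#SBATCH --account=my_account        # placeholder: use an account from my-accounts
#SBATCH --partition=tier3
#SBATCH --time=0-04:00:00           # 4 hours, in D-HH:MM:SS
#SBATCH --output=logs/%x_%j.out     # %x = job name, %j = job ID; create logs/ first
#SBATCH --error=logs/%x_%j.err
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16g                   # units given explicitly (default is MB)

# The job's actual work goes below the #SBATCH header.
echo "Running ${SLURM_JOB_NAME:-example_job} with ${SLURM_CPUS_PER_TASK:-4} CPUs"
```

Submit it with `sbatch <script_name>.sh`; the `#SBATCH` lines are comments to the shell, so Slurm reads them but an ordinary `bash` invocation ignores them.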
3 - Slurm Environment Variables
| Value | Variable | Notes |
|---|---|---|
| Job ID | $SLURM_JOB_ID | Set automatically on submission |
| Job Name | $SLURM_JOB_NAME | Set by --job-name |
| Number of Tasks | $SLURM_NTASKS | Set by --ntasks |
| Number of Tasks per Node | $SLURM_NTASKS_PER_NODE | Set by --ntasks-per-node |
| Number of CPUs per Task | $SLURM_CPUS_PER_TASK | Set by --cpus-per-task |
| Memory per CPU | $SLURM_MEM_PER_CPU | Set by --mem-per-cpu |
| Memory per Node | $SLURM_MEM_PER_NODE | Set by --mem |
| Job Array Task ID | $SLURM_ARRAY_TASK_ID | Set automatically on submission |
Heterogeneous groups use the same variables with _PACK_GROUP_<number> appended.
See the Slurm documentation for the full list. Do not use Slurm Input Environment Variables.
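Inside a batch script, these variables can drive per-task behavior. A small sketch using the array task ID to pick an input file; the `inputs/` directory and file-naming scheme are hypothetical, and since the variable is only set inside a Slurm array job, a default of 0 is used here for illustration:

```shell
#!/bin/bash
# Sketch: map this array task's ID to its input file.
# Outside a Slurm array job SLURM_ARRAY_TASK_ID is unset, so default to 0.
TASK_ID="${SLURM_ARRAY_TASK_ID:-0}"
INPUT="inputs/sample_${TASK_ID}.txt"   # hypothetical naming scheme
echo "Task ${TASK_ID} would process ${INPUT}"
```

Submitted with `#SBATCH --array=0-9`, each of the ten tasks sees its own `SLURM_ARRAY_TASK_ID` and therefore selects a different file.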
4 - External Resources
Help: If you have further questions, or you find an issue with this documentation, please submit a ticket or contact us on Slack for additional assistance.