This is a hyper-condensed summary of Slurm basics. If you haven’t already, we highly recommend working through Part 1 and Part 2 of the Slurm tutorial first; otherwise, much of what follows won’t be very useful to you.

1 - Slurm Commands

| Purpose | Example | Notes |
| --- | --- | --- |
| View Slurm Accounts | `$ my-accounts` | |
| View Recent Job Details | `$ sacct` | |
| Submit Batch Job | `$ sbatch <sbatch_script.sh>` | |
| Submit Dependent Batch Job | `$ sbatch --dependency=afterok:<job_id> <sbatch_script.sh>` | Starts only if `<job_id>` succeeds |
| Cancel Specific Job | `$ scancel <job_id>` | |
| Cancel All of Your Jobs | `$ scancel --me` | |
| View Specific Job Details | `$ scontrol show job <job_id>` | |
| View Cluster Details | `$ sinfo` | |
| Submit Interactive Job | `$ sinteractive` | |
| Submit Interactive Job w/ GPU | `$ sinteractive --gres=gpu:<type>:<number>` | |
| See All Jobs in Queue | `$ squeue` | |
| See Only Your Jobs | `$ squeue --me` | |
| See When Your Jobs Will Start | `$ squeue --me --start` | Worst-case estimate |
| See Upcoming Maintenance Windows | `$ time-until-maintenance` | |
| View Job Priority Information | `$ sprio -u <rit_username>` | |
| Parallel Computation | `srun <options> <command>` | Inside an sbatch script |
| Heterogeneous Group | `srun --pack-group=<number>` | Inside an sbatch script |
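As a sketch of how the submission commands above chain together (the script names `preprocess.sh` and `train.sh` are hypothetical placeholders):

```shell
# Submit a first job and capture its ID; --parsable makes sbatch print
# only the job ID, which is convenient for scripting.
jid=$(sbatch --parsable preprocess.sh)

# Submit a second job that starts only if the first one finishes successfully.
sbatch --dependency=afterok:"$jid" train.sh

# Check on your jobs; if something went wrong, cancel everything of yours.
squeue --me
# scancel --me
```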

2 - Slurm Configuration Options

| Option | Example | Notes |
| --- | --- | --- |
| Job Name | `#SBATCH --job-name=<job_name>` | |
| Comment | `#SBATCH --comment=<comment>` | |
| Account | `#SBATCH --account=<account_name>` | |
| Partition | `#SBATCH --partition=<debug,tier3>` | Use `debug` only for debugging |
| Time Limit | `#SBATCH --time=D-HH:MM:SS` | |
| Output File | `#SBATCH --output=<optional_dir>/%x_%j.out` | Create the folder before submitting |
| Error File | `#SBATCH --error=<optional_dir>/%x_%j.err` | Create the folder before submitting |
| Slack Username | `#SBATCH --mail-user=slack:@<rit_username>` | |
| Notification Type | `#SBATCH --mail-type=<BEGIN,END,FAIL,ALL>` | |
| Number of Nodes | `#SBATCH --nodes=<num_nodes>` | |
| Excluding Nodes | `#SBATCH --exclude=<node1,node2,...>` | Do not run on these nodes |
| Exclusive Node Access | `#SBATCH --exclusive` | Use with `--mem=0` |
| Number of Tasks | `#SBATCH --ntasks=<num_tasks>` | i.e. processes; default 1 |
| Tasks per Node | `#SBATCH --ntasks-per-node=<num_tasks>` | Default 1 |
| CPUs per Task | `#SBATCH --cpus-per-task=<num_cpus>` | Default 1 |
| Request Specific GPUs | `#SBATCH --gres=gpu:<type>:<number>` | Use what you request |
| Request Generic GPUs | `#SBATCH --gres=gpu:<number>` | Use what you request |
| GPUs per Task | `#SBATCH --gpus-per-task=<type>:<number>` | Use what you request |
| Memory per Node | `#SBATCH --mem=<number><k,m,g,t>` | Default is MB; specify units |
| Memory per CPU | `#SBATCH --mem-per-cpu=<number><k,m,g,t>` | Default is MB; specify units |
| All Memory on Node | `#SBATCH --mem=0` | Use with `--exclusive` |
| Job Array Size | `#SBATCH --array=<value>-<value>` | |
| Heterogeneous Separator | `#SBATCH hetjob` | Separates component job specifications |
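Tying several of the options above together, a minimal batch script might look like the following (the account, partition, and resource values are illustrative, not prescriptive; fill in the placeholders for your own job):

```shell
#!/bin/bash -l

#SBATCH --job-name=example
#SBATCH --comment="Short example run"
#SBATCH --account=<account_name>
#SBATCH --partition=tier3
#SBATCH --time=0-01:00:00
#SBATCH --output=%x_%j.out
#SBATCH --error=%x_%j.err
#SBATCH --mail-user=slack:@<rit_username>
#SBATCH --mail-type=END,FAIL
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=4g

# The actual work: srun launches the command with the resources requested above.
srun <command>
```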

3 - Slurm Environment Variables

| Value | Variable | Notes |
| --- | --- | --- |
| Job ID | `$SLURM_JOB_ID` | Set automatically on submission |
| Job Name | `$SLURM_JOB_NAME` | Set by `--job-name` |
| Number of Tasks | `$SLURM_NTASKS` | Set by `--ntasks` |
| Number of Tasks per Node | `$SLURM_NTASKS_PER_NODE` | Set by `--ntasks-per-node` |
| Number of CPUs per Task | `$SLURM_CPUS_PER_TASK` | Set by `--cpus-per-task` |
| Memory per CPU | `$SLURM_MEM_PER_CPU` | Set by `--mem-per-cpu` |
| Memory per Node | `$SLURM_MEM_PER_NODE` | Set by `--mem` |
| Job Array Task ID | `$SLURM_ARRAY_TASK_ID` | Set automatically on submission |

Heterogeneous groups use the same variables with `_PACK_GROUP_<number>` appended.
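As a sketch of how these variables might be used inside a batch script (the fallback defaults are only so the script can be tested outside a Slurm job, and the `analyze` command is hypothetical):

```shell
#!/bin/bash

# ${VAR:-default} falls back to a default when the variable is unset,
# so this snippet also runs outside of a Slurm job for local testing.
task_id="${SLURM_ARRAY_TASK_ID:-0}"
cpus="${SLURM_CPUS_PER_TASK:-1}"

echo "Array task ${task_id} running with ${cpus} CPU(s)"

# Hypothetical use: process one input file per array task.
# ./analyze --threads "$cpus" "input_${task_id}.dat"
```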

A full list is available in the official Slurm documentation. Do not use Slurm Input Environment Variables.

4 - External Resources


