Partitions
Your Slurm job can be submitted to a specific partition, which defines not only the hardware available to the job (such as GPU nodes) but also constrains other job parameters (maximum job size, time limit, and priority).
For example, the default partition is called "short", and jobs submitted to it
can consume up to 8 generic nodes (or 512 cores) for 24 hours.
If you need production access to the GPU nodes, you need to assign your job to the gpu
partition, where you can use up to 64 cores and 4 NVIDIA A100 cards for two days.
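For example, a GPU job could be submitted along these lines (a minimal sketch; job.sh is a hypothetical batch script, and the --gres specification assumes the cards are exposed under the generic resource name gpu):

login01:~$ sbatch -p gpu --gres=gpu:2 --cpus-per-task=16 -t 1-00:00:00 job.sh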
The purpose of the testing partition is to allow short-term access to the resources for development and testing.
This is especially helpful for developers when the cluster is fully utilized.
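A typical use is a short interactive session, for example (a sketch; the resource request is only an illustration within the partition's 16-core, 30-minute limits):

login01:~$ srun -p testing -c 4 -t 00:15:00 --pty bash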
If your job requirements don't match the limits set for the available partitions, contact us via our helpdesk.
To select a given partition with a Slurm command, use the -p <partition>
option:
sbatch|srun|salloc|sinfo|squeue... -p <partition> [...]
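For example, to run a 128-task job on two nodes in the medium partition (a sketch; ./my_app stands in for your own executable):

login01:~$ srun -p medium -N 2 -n 128 -t 1-12:00:00 ./my_app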
List of Partitions and Their Parameters
| Partition | Nodes | Time limit (d-hh:mm) | Job size limit (nodes/cores) | GPUs | Priority factor |
|-----------|-------|----------------------|------------------------------|------|-----------------|
| testing | login01,login02 | 0-00:30 | 1/16 | 1 | 0 |
| gpu | n141-n148 | 2-00:00 | 1/64 | 4 | 0 |
| short | n001-n140 | 1-00:00 | 8/512 | 0 | 2 |
| medium | n001-n140 | 2-00:00 | 4/256 | 0 | 1 |
| long | n001-n140 | 4-00:00 | 1/64 | 0 | 0 |
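You can also query these parameters directly from Slurm. For example, sinfo's output format options will print each partition's time limit, node count, and CPUs per node (the format string below is just one possible selection):

login01:~$ sinfo -o "%10P %12l %6D %4c"

Here %P is the partition name, %l the time limit, %D the node count, and %c the CPUs per node.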
Partition State Information
For detailed information about all available partitions and their definitions/limits, use:
login01:~$ scontrol show partitions <name>
Long partition information
login01:~$ scontrol show partitions long
PartitionName=long
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=4-00:00:00 DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=1 MaxTime=4-00:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=n[001-140]
PriorityJobFactor=0 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=8960 TotalNodes=140 SelectTypeParameters=NONE
JobDefaults=(null)
DefMemPerCPU=4000 MaxMemPerNode=UNLIMITED
TRES=cpu=8960,mem=35000G,node=140,billing=8960
TRESBillingWeights=CPU=1.0,Mem=0.256G
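Note how the billing weights are balanced: the memory weight is applied per GB, so the default allocation of 4000 MB per CPU (≈ 3.9 GB) bills 3.9 × 0.256 ≈ 1.0, the same as one CPU. (This reading assumes Slurm's default behavior of summing the weighted TRES values; with PriorityFlags=MAX_TRES only the largest component would count.)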
Partition Limits
At the partition level, only the following limits can be enforced:
DefaultTime
: Default time limit

MaxNodes
: Maximum number of nodes per job

MinNodes
: Minimum number of nodes per job

MaxCPUsPerNode
: Maximum number of CPUs a job can be allocated on any node

MaxMemPerCPU/Node
: Maximum memory a job can be allocated on any CPU or node

MaxTime
: Maximum length of time a user's job can run
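As a concrete consequence of these limits, the long partition shown above has MaxNodes=1, so a two-node submission to it is rejected, while a single-node job within MaxTime is accepted (a sketch; job.sh is a hypothetical batch script):

login01:~$ sbatch -p long -N 1 -n 64 -t 4-00:00:00 job.sh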