Slurm¶
The UPPMAX clusters are a shared resource. To ensure fair use, UPPMAX uses a scheduling system. A scheduling system decides at what time which calculation is done. The software used is called Slurm.
Why not write SLURM?
Indeed, Slurm started as an abbreviation of 'Simple Linux Utility for Resource Management'. However, the Slurm homepage uses 'Slurm' to describe the tool, hence we use Slurm too.
This page describes how to use Slurm in general. See optimizing jobs for how to optimize Slurm jobs, and Slurm troubleshooting for how to fix Slurm errors.
For information specific to clusters, see:
Slurm Commands¶
The Slurm system is accessed using the following commands:
- `interactive`: start an interactive session. This is described in depth for Bianca and Rackham
- `sbatch`: submit and run a batch job script
- `srun`: typically used inside batch job scripts for running parallel jobs (see examples further down)
- `scancel`: cancel one or more of your jobs
- `sinfo`: view information about Slurm nodes and partitions
```mermaid
flowchart TD
    login_node(User on login node)
    interactive_node(User on interactive node)
    computation_node(Computation node):::calculation_node
    login_node --> |move user, interactive|interactive_node
    login_node ==> |submit jobs, sbatch|computation_node
    computation_node -.-> |can become| interactive_node
```
The different types of nodes an UPPMAX cluster has. The thick edge shows the topic of this page: how to submit jobs to a computation node.
Job parameters¶
This section describes how to specify a Slurm job:
- Getting started redirects to the cluster-specific pages
- Partitions specify the type of job
Getting started¶
To let Slurm schedule a job, one uses the `sbatch` command.
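For example, assuming a job script named `my_script.sh` (a placeholder name):

```shell
sbatch my_script.sh
```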
Minimal and complete examples of using `sbatch` are described in the respective cluster guides:
Specify duration of the run¶
To let Slurm schedule a job with a certain maximum duration, pass the `--time` (or `-t`) flag to `sbatch`, for example for a job of 1 day, 23 hours, 59 minutes and 0 seconds:
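In Slurm's `days-hours:minutes:seconds` notation, that duration is written as follows (the script name is a placeholder):

```shell
sbatch --time=1-23:59:00 my_script.sh
```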
If the job takes too long, this will result in a timeout error and the job will be aborted.
The maximum duration of the run depends on the cluster you use.
Partitions¶
Partitions are a way to tell what type of job you are submitting, e.g. if it needs to reserve a whole node, or part of a node.
To let Slurm schedule a job using a partition, use the `--partition` (or `-p`) flag, for example:
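A sketch of such a call, with a placeholder script name:

```shell
sbatch --partition=core my_script.sh

# equivalently, using the short flag:
sbatch -p core my_script.sh
```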
These are the partition names and their descriptions:

| Partition name | Description |
| --- | --- |
| `core` | Use one or more cores |
| `node` | Use a full node's set of cores |
| `devel` | Development job |
| `devcore` | Development job |
The core partition¶
The core partition allows one to use one or more cores.
Here is the minimal use, for one core:
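A minimal call requesting one core could look like this (the script name is a placeholder):

```shell
sbatch -p core -n 1 my_script.sh
```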
To specify multiple cores, use the `--ntasks` (or `-n`) flag, for example:
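A sketch of such a call, again with a placeholder script name:

```shell
sbatch -p core -n 2 my_script.sh
```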
Here, two cores are used.
What is the relation between `ntasks` and the number of cores?
Indeed, the `--ntasks` flag only specifies the number of tasks.
However, by default, the number of tasks per core is set to one, so requesting two tasks allocates two cores.
One can make this link explicit by using:
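One way to state this explicitly is Slurm's `--ntasks-per-core` flag, shown here as an `#SBATCH` comment inside a job script:

```shell
#SBATCH --ntasks-per-core=1
```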
This is especially important if you might adjust core usage of the job to be something less than a full node.
The node partition¶
Whenever `-p node` is specified, an entire node is used, no matter how many cores are requested with `-n [no_of_cores]`.
For example, some bioinformatics tools show little performance gain beyond 8-10 cores per job; in this case, specify `-p core -n 8` to ensure that only 8 cores (less than a single node) are allocated for such a job.
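Such a job could be submitted like this (`myproject` and `my_script.sh` are placeholder names):

```shell
sbatch -A myproject -p core -n 8 my_script.sh
```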
The devel partition¶
The devel partition is for development jobs.
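As a sketch, a development job on a full node could be submitted by combining `-p devel` with a short duration (`myproject` and `my_script.sh` are placeholder names; check your cluster's limits for the maximum allowed time):

```shell
sbatch -A myproject -p devel -t 1:00:00 my_script.sh
```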
The devcore partition¶
The devcore partition is for development jobs that use one or more cores.
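A sketch of a core-based development job, with placeholder project and script names:

```shell
sbatch -A myproject -p devcore -n 2 -t 1:00:00 my_script.sh
```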
Specifying job parameters¶
Whether you use the UPPMAX clusters interactively or in batch mode, you always have to specify a few things, like number of cores needed, running time etc. These things can be specified in two ways:
Either as flags sent to the different Slurm commands (`sbatch`, `srun`, the `interactive` command, etc.), like so:
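For example, passing all parameters directly on the command line (the script name is a placeholder):

```shell
sbatch -A p2012999 -p core -n 1 -t 12:00:00 -J some_job_name my_script.sh
```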
or, when using the `sbatch` command, they can be specified inside the job script file itself, using special `#SBATCH` comments, for example:
```bash
#!/bin/bash -l
#SBATCH -A p2012999
#SBATCH -p core
#SBATCH -n 1
#SBATCH -t 12:00:00
#SBATCH -J some_job_name

# The commands the job runs go here
```
If doing this, one only needs to start the script like so, without any flags:
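Assuming the job script above is saved as `my_script.sh` (a placeholder name):

```shell
sbatch my_script.sh
```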
How can I see how many resources my project has used?
Use projplot.
Need more resources or GPU?¶
More memory¶
If you need more memory than the 128 GB available on common nodes, you can allocate larger nodes. Their number and sizes differ among the clusters.
The table below shows the configurations and the flags to use.
| RAM | Rackham | Snowy | Bianca |
| --- | --- | --- | --- |
| 256 GB | `-C mem256GB` | `-C mem256GB` | `-C mem256GB` |
| 512 GB | N/A | `-C mem512GB` | `-C mem512GB` |
| 1 TB | `-C mem1TB` | N/A | N/A |
| 2 TB | N/A | `-p veryfat -C mem2TB` | N/A |
| 4 TB | N/A | `-p veryfat -C mem4TB` | N/A |
GPUs¶
- Bianca: nodes with NVIDIA A100 40 GB
    - All GPU nodes have at least 256 GB RAM (fat nodes), with 16 CPU cores and 2 GPUs per node
- Snowy: nodes with Tesla T4 16 GB
    - The GPU nodes have either 128 or 256 GB memory and one GPU per node
Slurm options:
- Snowy 128 GB: `-M snowy -p node --gres=gpu:1 -t 1:0:1` (please note that `-t` has to be more than 1 hour)
- Snowy 256 GB: `-M snowy -p node -C mem256GB --gres=gpu:1 -t 1:0:1`
- Bianca: `-C gpu --gres=gpu:1 -t 01:10:00`
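Combining these flags with `sbatch`, a GPU job on Snowy could be sketched like this (`myproject` and `my_script.sh` are placeholder names):

```shell
sbatch -A myproject -M snowy -p node --gres=gpu:1 -t 1:00:01 my_script.sh
```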