Using Slurm on Pelle

This page describes how to use Slurm on Pelle.

What is Slurm?

See the general page about Slurm

What is Pelle?

See the general page about Pelle

See Slurm troubleshooting for how to fix Slurm errors.

sbatch (and interactive) on Pelle

sbatch (and interactive) work the same as on the other clusters; the only difference is that some flags/options may differ, such as the partition name, see below.

Want to start an interactive session?

See how to start an interactive session on Pelle

Here it is shown how to submit a job with:

  • Command-line Slurm parameters
  • Slurm parameters in the script

Partitions on Pelle

The partition flag is either --partition or -p.

Partition name   Description
pelle            (Default) Use one or more CPU cores
fat              Use a fat node with 2 or 3 TB memory, see below
gpu              GPU node, 2 types, see below

The pelle partition

The pelle partition is the default, so you can omit -p or --partition.

It allocates one or more cores on an ordinary CPU node (up to 96 cores).

Here is the minimal use for one core:

sbatch -A [project_code] [script_filename]

For example:

sbatch -A staff my_script.sh

To specify multiple cores, use --ntasks (or -n) like this:

sbatch -A [project_code] --ntasks [number_of_cores] [script_filename]

For example:

sbatch -A staff --ntasks 2 my_script.sh

Here, two cores are used.

What is the relation between ntasks and number of cores?

Indeed, the --ntasks flag only indicates the number of tasks. However, by default, the number of tasks per core is set to one. One can make this link explicit by using:

sbatch -A [project_code] --partition pelle --ntasks [number_of_cores] --ntasks-per-core 1 [script_filename]

This is especially important if you might adjust core usage of the job to be something less than a full node.
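As a sketch, a batch script submitted with --ntasks 2 could read the allocated task count from the job environment Slurm sets up (the script name and the srun payload are just example choices):

```shell
#!/bin/bash
# Hypothetical script, e.g. saved as my_parallel_script.sh and
# submitted with: sbatch -A staff --ntasks 2 my_parallel_script.sh

# Slurm exports the number of allocated tasks to the job environment
echo "Running with ${SLURM_NTASKS:-1} task(s)"

# Launch one copy of the program per task (here simply 'hostname')
srun hostname
```

Inside the job, SLURM_NTASKS reflects the --ntasks value, so the script can adapt if you later resubmit it with a different core count.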

The fat partition

With the fat partition you reach compute nodes with more memory. At the moment there is just one 2 TB node and one 3 TB node.

  • To allocate 2 TB: -p fat -C 2TB

    • Example: interactive -A staff -t 1:0:0 -p fat -C 2TB
  • To allocate 3 TB: -p fat -C 3TB

    • Example: interactive -A staff -t 1:0:0 -p fat -C 3TB
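The interactive examples above have direct sbatch equivalents; a sketch of a batch script for the 2 TB node (the time limit and script name are just example values):

```shell
#!/bin/bash
# Hypothetical batch script for the 2 TB fat node; submit with:
#   sbatch -A staff my_fat_script.sh
#SBATCH -p fat
#SBATCH -C 2TB
#SBATCH -t 1:0:0

# Report the memory actually available on the allocated node
free -g | head -2
```

Swap -C 2TB for -C 3TB to target the 3 TB node instead.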

The gpu partition

With the gpu partition you reach the nodes with GPUs.

There are two kinds of GPUs at the moment:

  • 4 nodes with the lighter type, L40s, enough for most problems. Each node has 10 (!) GPUs. Most often just one GPU is needed, so remember to state that you need just 1, see below.
  • 2 nodes with the larger type, H100, which can be suitable for large training runs. Each node has 2 GPUs. Again, most often just one GPU is needed, so remember to state that you need just 1, see below.

Hence, as a first choice, allocate just one of the default L40s GPUs.

  • To allocate L40s: -p gpu --gres=gpu:<number of GPUs> or -p gpu --gpus=l40s:<number of GPUs>

    • Example with 1 GPU: interactive -A staff -t 1:0:0 -p gpu --gres=gpu:1
    • Example with 11 GPUs: interactive -A staff -t 1:0:0 -p gpu --gres=gpu:11 will fail because there are just 10 GPUs on one node!
  • To allocate H100: -p gpu --gpus=h100:<number of GPUs>

    • Example with 1 GPU: interactive -A staff -t 1:0:0 -p gpu --gpus=h100:1
    • Example with 3 GPUs: interactive -A staff -t 1:0:0 -p gpu --gpus=h100:3 will fail because there are just 2 GPUs on one node!
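The same requests work in batch mode; a sketch of a script requesting a single L40s GPU (the script name and time limit are example values):

```shell
#!/bin/bash
# Hypothetical batch script requesting one L40s GPU; submit with:
#   sbatch -A staff my_gpu_script.sh
#SBATCH -p gpu
#SBATCH --gres=gpu:1
#SBATCH -t 1:0:0

# List the GPU(s) Slurm has made visible to this job
nvidia-smi --list-gpus
```

Requesting just one GPU leaves the node's remaining GPUs free for other users' jobs.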

sbatch a script with command-line Slurm parameters

The minimal command to use sbatch with command-line Slurm parameters is:

sbatch -A [project_code] [script_filename]

where [project_code] is the project code, and [script_filename] the name of a bash script, for example:

sbatch -A uppmax2023-2-25 my_script.sh
Forgot your Rackham project?

One can go to the SUPR NAISS pages to see one's projects.

Example of the Rackham project called 'UPPMAX 2023/2-25'

On the SUPR NAISS pages, projects are called 'UPPMAX [year]/[month]-[day]', for example 'UPPMAX 2023/2-25'. The project name as used on Rackham has a slightly different form: the account name is uppmax[year]-[month]-[day], for example uppmax2023-2-25.

What is in the script file?

The script file my_script.sh is a minimal example. Such a script could be:

#!/bin/bash
echo "Hello"

Again, what is shown here is a minimal use of sbatch.
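After submitting, sbatch prints a job ID. A sketch of how one might follow up on the job (the job ID shown is just an example):

```shell
# sbatch replies with something like: Submitted batch job 1234567

# Check whether the job is still queued or running
squeue -u "$USER"

# Once finished, the script's output lands in slurm-<jobid>.out
# in the directory you submitted from
cat slurm-1234567.out
```

For the minimal script above, the output file would contain the line "Hello".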

sbatch a script with Slurm parameters in script

The minimal command to use sbatch with Slurm parameters in the script:

sbatch [script_filename]

where [script_filename] is the name of a bash script, for example:

sbatch my_script.sh

The script must contain at least the following lines:

#SBATCH -A [project_code]

where [project_code] is the project code, for example:

#SBATCH -A uppmax2023-2-25
Forgot your Rackham project?

One can go to the SUPR NAISS pages to see one's projects.

Example of the Rackham project called 'UPPMAX 2023/2-25'

On the SUPR NAISS pages, projects are called 'UPPMAX [year]/[month]-[day]', for example 'UPPMAX 2023/2-25'. The project name as used on Rackham has a slightly different form: the account name is uppmax[year]-[month]-[day], for example uppmax2023-2-25.

A full example script would be:

#!/bin/bash
#SBATCH -A uppmax2023-2-25
echo "Hello"

Again, what is shown here is a minimal use of sbatch.
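A slightly fuller (still hypothetical) version of the script above shows where other common Slurm parameters would go; the job name, time limit, and task count are example values:

```shell
#!/bin/bash
#SBATCH -A uppmax2023-2-25      # project code (example from this page)
#SBATCH -J hello                # job name, shown in squeue
#SBATCH -t 0:10:00              # wall-time limit (here 10 minutes)
#SBATCH --ntasks 1              # one core is enough for this job
echo "Hello"
```

Submit it with just sbatch my_script.sh, since all parameters now live inside the script.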