
The job scheduler

Learning outcomes

  • Practice using the UPPMAX documentation
  • Can find their NAISS/UPPMAX projects
  • Can see the job queue
  • Can submit a job from the command line
  • Can submit a job using a script
  • Can cancel a job
For teachers

Teaching goals are:

  • Learners have practiced using the UPPMAX documentation
  • Learners can find their NAISS/UPPMAX projects
  • Learners can see the job queue
  • Learners can submit a job from the command line
  • Learners can submit a job using a script
  • Learners can cancel a job

Lesson plan:

gantt
  title The job scheduler
  dateFormat X
  axisFormat %s
  section First hour
  Prior : prior, 0, 5s
  Present: present, after prior, 2s
  %% It took me 17 mins, here I do that time x2
  Challenge: crit, challenge, after present, 34s
  %% Here I use the same time it took me to give feedback
  Feedback: feedback, after challenge, 17s

Prior questions:

  • What is a job?
  • What is a job scheduler?
  • Why does one need a job scheduler?
  • What information may one need to tell a job scheduler?
  • Is it possible to have all nodes of Rackham running your jobs?

Why?

You want to run calculations that take a long time and use a lot of CPU power. To do so, you need to schedule these calculations as jobs!

This is a short introduction to how to reach the calculation nodes. Wednesday afternoon is devoted to this topic!

Using the job scheduler

The job scheduler consists of multiple programs. We use a minimal set of three:

flowchart TD
  sbatch[sbatch: submit a job]
  scancel[scancel: cancel a running job]
  squeue[squeue: view the job queue]
  sbatch --> |Oops| scancel
  sbatch --> |Verify| squeue

Exercises

Need a video?

Here is a video that shows the solution of these exercises

Exercise 1: see the job queue

Go to the UPPMAX documentation at https://docs.uppmax.uu.se, then answer these questions:

  • Find the page on squeue, the program to view the job queue
Answer

It can be found at https://docs.uppmax.uu.se/software/squeue/

  • View all jobs in the queue
Answer

View all jobs in the queue:

squeue
  • View all your jobs in the queue
Answer

View your jobs in the queue:

squeue -u $USER

You will probably see that you have zero jobs scheduled

Exercise 2: view my UPPMAX projects

Go to the UPPMAX documentation at https://docs.uppmax.uu.se, then answer these questions:

  • Find the UPPMAX documentation page about projects
Answer

It can be found at https://docs.uppmax.uu.se/getting_started/project/

  • Where does that page redirect you, to view your projects?
Answer

You are redirected to the SUPR NAISS page at https://supr.naiss.se/

  • View all your projects
Answer

Here is an example of a user's SUPR projects

Example SUPR projects

  • View the project of this course
Answer

Here is what it looks like:

The NAISS project of this course

Exercise 3: submit a minimal job with Slurm parameters in the command-line

Go to the UPPMAX documentation at https://docs.uppmax.uu.se, then answer these questions:

  • Create a minimal bash script that does something. It may or may not use a module. It does need a shebang (but go ahead and omit it to see which error occurs)!
Answer

A minimal bash script would be:

#!/bin/bash
echo "Hello"

But any valid bash script with the same first line will do.

  • Find the page on sbatch, the program to submit a job to the queue
Answer

It can be found at https://docs.uppmax.uu.se/software/sbatch/

  • Use sbatch to submit your bash script to the queue
Answer

Submit your script to the job queue like this:

sbatch -A naiss2024-22-49 my_script.sh
What does that look like?

Your output will look similar to this:

[sven@rackham3 ~]$ sbatch -A naiss2024-22-49 my_script.sh
Submitted batch job 49309848

The number is your job number
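
If you want to reuse the job number, for example to cancel the job later, you can capture it from the text that sbatch prints. Below is a minimal sketch, assuming the confirmation line has the form shown above. The function name extract_job_id is a hypothetical helper, not part of Slurm:

```shell
# extract_job_id is a hypothetical helper, not a Slurm program.
# It reads sbatch's confirmation line from standard input and
# prints the last field of that line, which is the job number.
extract_job_id() {
  awk '/^Submitted batch job/ { print $NF }'
}

# Example with the confirmation line shown above
echo "Submitted batch job 49309848" | extract_job_id
```

On the cluster, this could be used as job_id=$(sbatch -A naiss2024-22-49 my_script.sh | extract_job_id). The sbatch documentation also describes a --parsable flag that prints only the job number, which may be simpler.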

I get an error: 'This does not look like a batch script'

As stated at the start of this exercise, the bash script needs to have a shebang.

Running a script without a shebang such as this:

module load cowsay/3.03
cowsay hello

Will result in the following error:

[sven@rackham3 ~]$ sbatch -A naiss2024-22-49 my_script.sh
sbatch: error: This does not look like a batch script.  The first
sbatch: error: line must start with #! followed by the path to an interpreter.
sbatch: error: For instance: #!/bin/sh
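
To catch this error before submitting, one can check the first line of the script. Below is a minimal sketch. The function name has_shebang and the two throwaway file names are hypothetical and only serve as illustration:

```shell
# has_shebang is a hypothetical helper, not part of Slurm.
# It succeeds if the first line of the given file starts with '#!'.
has_shebang() {
  first_line=$(head -n 1 "$1")
  case "$first_line" in
    '#!'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Try it on two throwaway scripts
demo_dir="$(mktemp -d)"
printf '#!/bin/bash\necho "Hello"\n' > "$demo_dir/with_shebang.sh"
printf 'echo "Hello"\n' > "$demo_dir/without_shebang.sh"

for script in "$demo_dir"/*.sh; do
  if has_shebang "$script"; then
    echo "$(basename "$script"): has a shebang"
  else
    echo "$(basename "$script"): add '#!/bin/bash' as the first line"
  fi
done
rm -r "$demo_dir"
```
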
  • Use squeue to confirm that your job is in the job queue. You may need to be fast to see it!
Answer

The easiest way is:

squeue -u $USER

Because the job may finish very fast, a trick is to use a semicolon to run the two commands directly after each other:

sbatch -A naiss2024-22-49 my_script.sh; squeue -u $USER

The output will be similar to:

[richel@rackham3 ~]$ sbatch -A naiss2024-22-49 my_script.sh; squeue -u $USER
Submitted batch job 49309860
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          49309860      core my_scrip   richel PD       0:00      1 (None)
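
In the ST (state) column, PD means the job is pending and R means it is running. As a sketch of how to read this output, the hypothetical helper pending_jobs below prints the job numbers of all pending jobs. It assumes the default column layout shown above and job names without spaces:

```shell
# pending_jobs is a hypothetical helper, not a Slurm program.
# It reads squeue output from standard input and prints the JOBID
# (1st column) of every job whose state (5th column, ST) is PD.
pending_jobs() {
  awk 'NR > 1 && $5 == "PD" { print $1 }'
}

# Example input: the squeue output shown above
pending_jobs << 'EOF'
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          49309860      core my_scrip   richel PD       0:00      1 (None)
EOF
```

On the cluster, this could be used as squeue -u $USER | pending_jobs.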

Exercise 4: submit a minimal job with Slurm parameters in the bash script

Go to the UPPMAX documentation at https://docs.uppmax.uu.se, then answer these questions:

  • Find the page on sbatch again
Answer

It can be found at https://docs.uppmax.uu.se/software/sbatch/

  • Modify your bash script in such a way that it can be submitted to the queue by sbatch my_script.sh, by putting the -A parameter in the script
Answer

Here is an example minimal script:

#!/bin/bash
#SBATCH -A uppmax2023-2-25
module load cowsay/3.03
cowsay hello
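
sbatch reads its parameters from the lines that start with #SBATCH. To check that a script sets a project before submitting it without -A on the command line, something like the sketch below could be used. The function name has_account_directive is hypothetical, and the check only looks for -A or its long form --account:

```shell
# has_account_directive is a hypothetical helper, not part of Slurm.
# It succeeds if the file contains an '#SBATCH -A ...' or
# '#SBATCH --account...' line.
has_account_directive() {
  grep -Eq '^#SBATCH[[:space:]]+(-A|--account)' "$1"
}

# Try it on a throwaway copy of the example script above
demo_script="$(mktemp)"
printf '#!/bin/bash\n#SBATCH -A uppmax2023-2-25\nmodule load cowsay/3.03\ncowsay hello\n' > "$demo_script"

if has_account_directive "$demo_script"; then
  echo "Project is set in the script"
else
  echo "No project in the script: use sbatch -A on the command line"
fi
rm -f "$demo_script"
```
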

Exercise 5: cancel a job

Go to the UPPMAX documentation at https://docs.uppmax.uu.se, then answer these questions:

  • Find the page on scancel
Answer

It can be found at https://docs.uppmax.uu.se/software/scancel/

  • Schedule a job and cancel it
Answer

Your output will be similar to this:

[sven@rackham3 ~]$ sbatch -A uppmax2023-2-25 my_script.sh 
Submitted batch job 49311056
[sven@rackham3 ~]$ scancel 49311056
[sven@rackham3 ~]$ 
Answer

For a job with a time limit of 1 day, 2 hours, 3 minutes and 4 seconds, use -t 1-2:3:4:

[sven@rackham3 ~]$ sbatch -A uppmax2023-2-25 -t 1-2:3:4 my_script.sh 
Submitted batch job 49311056
[sven@rackham3 ~]$ scancel 49311056
[sven@rackham3 ~]$
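
The value after -t has the form days-hours:minutes:seconds. To make that concrete, the sketch below converts such a value to seconds. The function name slurm_time_to_seconds is hypothetical, and it only handles this one form, although Slurm accepts shorter forms as well:

```shell
# slurm_time_to_seconds is a hypothetical helper, not part of Slurm.
# It converts a time limit written as days-hours:minutes:seconds
# to a number of seconds. Shorter forms are not handled.
slurm_time_to_seconds() {
  echo "$1" | awk -F '[-:]' '{ print $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }'
}

# 1 day + 2 hours + 3 minutes + 4 seconds
slurm_time_to_seconds 1-2:3:4
```

For 1-2:3:4 this gives 86400 + 7200 + 180 + 4 = 93784 seconds.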