Welcome to “Using Python in an HPC environment” course material

This material

Here you will find the content of the workshop Using Python in an HPC environment.


  • This course aims to give a brief, but comprehensive introduction to using Python in an HPC environment.

  • You will learn how to
    • use modules to load Python

    • find site installed Python packages

    • install packages yourself

    • use virtual environments

    • write a batch script for running Python

    • use Python in parallel

    • use Python for ML and on GPUs.

  • This course will consist of lectures interspersed with hands-on sessions where you get to try out what you have just learned.

Not covered

  • Improving Python coding skills

  • Specifics of other clusters

  • We aim to give this course in spring and fall every year.

Your expectations?

  • Find best practices for using Python at an HPC centre

  • Learn how to use and install packages

  • Use the HPC capabilities of Python


Target group

  • The course is for current or prospective users at UPPMAX or HPC2N, or possibly other clusters in Sweden.

  • We therefore apply Python solutions on both clusters, so that a broad audience can benefit.

  • We also provide links to the Python/Jupyter documentation at other Swedish HPC centres with personnel affiliated to NAISS.

Cluster-specific approaches

  • The course is a cooperation between UPPMAX (Rackham, Snowy, Bianca) and HPC2N (Kebnekaise). The main focus will be on UPPMAX’s systems, but Kebnekaise will be included as well. If you already have an account at Kebnekaise, you can use that system for the hands-on sessions.

  • In most cases there is little or no difference between UPPMAX’s systems and HPC2N’s systems (and the other HPC systems in Sweden), except naming of modules and such. We will mention (and cover briefly) instances when there are larger differences.

  • A short introduction to the centre-specific cluster architectures of UPPMAX and HPC2N follows further below.

How is the workshop run?

  • General sessions with small differences shown for UPPMAX and HPC2N in tabs

  • Main focus on the NAISS resources at UPPMAX, but Kebnekaise specifics will be covered

  • Users who already have accounts/projects at HPC2N/Kebnekaise are welcome to use that for the exercises. UPPMAX/Rackham will be used for everyone else.


Some practicals


  • You should have gotten an email with the links

  • Main room for lectures (recorded)

  • Breakout rooms
    • exercises, including a silent room for those who just want to work on their own without interruptions.

    • help

  • The lectures and demos will be recorded, but NOT the exercises.
    • If you ask questions during the lectures, you may thus be recorded.

    • If you do not wish to be recorded, please keep your microphone muted and your camera off during lectures, and write your questions in the Q/A document (see the information about the collaboration documents below).

  • Use your REAL NAME.

  • Please MUTE your microphone when you are not speaking

  • Use the “Raise hand” functionality under the “Participants” window during the lecture.

  • Please do not clutter the Zoom chat.

  • Behave politely!

Q/A collaboration document

The two HPC centers UPPMAX and HPC2N

Two HPC centers

  • There are many similarities:

    • Login vs. calculation/compute nodes

    • Environment module system, with software hidden until loaded with module load

    • Slurm batch job and scheduling system

    • pip install procedure

  • … and small differences:

    • commands to load Python, Python packages, R, Julia

    • slightly different flags to Slurm

  • … and some bigger differences:

    • UPPMAX has three different clusters

      • Rackham for general purpose computing on CPUs only

      • Snowy is available for local projects, suits long jobs (< 1 month), and has GPUs

      • Bianca is for sensitive data and has GPUs

    • HPC2N has Kebnekaise with GPUs

    • Conda is recommended only for UPPMAX users
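Because the commands for loading Python differ between the centres, it can help to confirm from inside Python which interpreter and packages you actually got after `module load`. The following is a minimal sketch using only the standard library; the helper names `python_summary` and `installed_version` are made up for this illustration and are not part of the course material.

```python
import sys
from importlib import metadata


def python_summary():
    """Describe the Python interpreter that is currently running."""
    return {
        "executable": sys.executable,               # path of the interpreter in use
        "version": sys.version.split()[0],          # e.g. "3.11.4"
        "prefix": sys.prefix,                       # installation (or environment) root
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


def installed_version(package):
    """Return the installed version of a package, or None if it is not found."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for key, value in python_summary().items():
        print(f"{key}: {value}")
    # "numpy" is only an example package name; it may or may not be installed.
    print("numpy:", installed_version("numpy"))
```

Running this before and after loading a Python module (or activating a virtual environment) shows whether the interpreter path and the visible packages changed as expected.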


Briefly about the cluster hardware and system at UPPMAX and HPC2N

What is a cluster?

  • Login nodes and calculation/compute nodes

  • A network of computers, each computer working as a node.

  • Each node contains several processor cores, RAM, and a local disk called scratch.

  • The user logs in to the login nodes over the Internet using SSH or ThinLinc.

    • Here the file management and lighter data analysis can be performed.

  • The calculation nodes have to be used for intense computing.
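To see the difference between login and calculation nodes in practice, a small script can report where a Python process is running. This is a minimal sketch, not part of the course material: the function name `describe_node` is hypothetical, and it relies on the assumption that Slurm sets the `SLURM_JOB_ID` environment variable inside a job allocation, so its absence suggests (but does not prove) that you are on a login node.

```python
import os
import socket


def describe_node(environ=None):
    """Return a short description of where this Python process is running."""
    env = os.environ if environ is None else environ
    job_id = env.get("SLURM_JOB_ID")  # set by Slurm inside a job allocation
    return {
        "hostname": socket.gethostname(),
        "cpus_on_node": os.cpu_count(),   # cores on the node, not your allocation
        "in_slurm_job": job_id is not None,
        "slurm_job_id": job_id,
    }


if __name__ == "__main__":
    for key, value in describe_node().items():
        print(f"{key}: {value}")
```

Run once on a login node and once inside an interactive session or batch job, the hostname and the `in_slurm_job` value should differ.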

Overview of the UPPMAX systems

graph TB

  Node1 -- interactive --> SubGraph2Flow
  Node1 -- sbatch --> SubGraph2Flow
  subgraph "Snowy"
  SubGraph2Flow(calculation nodes)
  end

  thinlinc -- usr-sensXXX + 2FA + VPN ----> SubGraph1Flow
  terminal -- usr --> Node1
  terminal -- usr-sensXXX + 2FA + VPN ----> SubGraph1Flow
  Node1 -- usr-sensXXX + 2FA + no VPN ----> SubGraph1Flow

  subgraph "Bianca"
  SubGraph1Flow(Bianca login) -- usr+passwd --> private(private cluster)
  private -- interactive --> calcB(calculation nodes)
  private -- sbatch --> calcB
  end

  subgraph "Rackham"
  Node1[Login] -- interactive --> Node2[calculation nodes]
  Node1 -- sbatch --> Node2
  end

Preliminary schedule






  • Syllabus and the clusters

  • Introduction to Python

  • Loading and running Python and using installed packages: lecture + type-along

  • Installing packages and isolated environments: lecture + type-along

  • Slurm batch scripts and arrays for Python jobs: lecture + type-along + exercise

  • Short leg stretch

  • Lecture + type-along

  • Catch-up time and Q/A (no recording)

  • Parallelising simple Python codes: lecture + type-along + exercise

  • Short leg stretch

  • Using GPUs for Python: lecture + type-along + exercise

  • Using Python for Machine Learning jobs: lecture + type-along + exercise

  • Summary + Evaluation

  • Exercises and Q&A on-demand



Prepare your environment now!


  • The course project on UPPMAX (Rackham) is: naiss2024-22-415

  • If you work on Kebnekaise, you may use a project you already have. The CPU hours used in this course are probably negligible.

  • Rackham: ssh <user>@rackham.uppmax.uu.se

  • Rackham through ThinLinc,

    • use the App with
      • address: rackham-gui.uppmax.uu.se (NB: leave out the https://www.!)

      • user: <username-at-uppmax>

    • or go to <https://rackham-gui.uppmax.uu.se>

      • here, you will need two-factor authentication.

  • Create a working directory where you can code along. We recommend creating it under the course project storage directory.

  • Example: if your username is “mrspock” and you are at UPPMAX, we recommend that you create a user folder in the course project folder and step into it:

    • cd /proj/hpc-python

    • mkdir mrspock

    • cd mrspock


  • Stay in the folder you just created above!

  • You can download the exercises from the course GitHub repo, under the “Exercises” directory or clone the whole repo!

  • Get an overview here: https://github.com/UPPMAX/HPC-python/tree/main/Exercises


  • On HPC2N, you can copy the exercises as a tarball: cp /proj/nobackup/python-hpc/exercises.tar.gz .

  • On UPPMAX, you can copy the exercises as a tarball: cp /proj/hpc-python/exercises.tar.gz .

  • Untar it: tar xzvf exercises.tar.gz
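The directory and tarball steps above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `prepare_workdir` is hypothetical, and the paths in the usage comment are the UPPMAX ones quoted above. It assumes the project directory already exists and that you have write permission there.

```python
import getpass
import shutil
import tarfile
from pathlib import Path


def prepare_workdir(project_dir, tarball, username=None):
    """Create <project_dir>/<username>, copy the tarball there and unpack it."""
    user = username or getpass.getuser()
    workdir = Path(project_dir) / user
    # Like "mkdir <username>" inside the project folder; fails if the
    # project folder itself is missing, and is a no-op if workdir exists.
    workdir.mkdir(exist_ok=True)

    # Like "cp <tarball> ." from inside the working directory.
    local_copy = workdir / Path(tarball).name
    shutil.copy(tarball, local_copy)

    # Like "tar xzvf exercises.tar.gz" (without the verbose listing).
    with tarfile.open(local_copy, "r:gz") as archive:
        archive.extractall(workdir)
    return workdir


# Example call on UPPMAX (paths from the instructions above):
# prepare_workdir("/proj/hpc-python", "/proj/hpc-python/exercises.tar.gz")
```

The function returns the working directory, so you can print it and `cd` there in your terminal afterwards.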

Clone the Git repository

  • git clone https://github.com/UPPMAX/HPC-python.git

NOTE If you are downloading / cloning from the course GitHub repo and into the above directory, your Python examples and batch submit file examples will be in a subdirectory of that.

Assuming you created a directory MYDIR-NAME under the project storage, you will find the examples as follows:

Python programs


Batch submit files


Use ThinLinc or terminal?

  • It is up to you!

  • Graphics are easier with ThinLinc, so it is recommended for the early session in which we plot a figure.

  • In this course you will have many windows open, so to save screen space it may be better to work in a terminal most of the time.

Content of the course