Welcome to “Using Python in an HPC environment” course material

This material

Here you will find the content of the workshop “Using Python in an HPC environment”.

Content

  • This course aims to give a brief but comprehensive introduction to using Python in an HPC environment.

  • You will learn how to (several of these steps are sketched briefly below)
    • use modules to load Python

    • find site-installed Python packages

    • install packages yourself

    • use virtual environments

    • write a batch script for running Python

    • use Python in parallel

    • use Python for ML and on GPUs

  • This course will consist of lectures interspersed with hands-on sessions where you get to try out what you have just learned.
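As a first taste, here is a minimal shell sketch of loading Python, creating a virtual environment, and installing a package into it. The module name and version are assumptions and differ between clusters; check what is available with module spider python at your site.

    # Load a Python module (name/version is an assumption; varies per cluster)
    module load python/3.11.8

    # Create and activate a virtual environment
    python -m venv myenv
    source myenv/bin/activate

    # Install a package into the environment
    pip install numpy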

Not covered

  • Improving Python coding skills

  • Specifics of other clusters

Note: We aim to give this course every spring and fall.

Your expectations?

  • Find best practices for using Python at an HPC centre

  • Learn how to use and install packages

  • Use the HPC capabilities of Python


Warning

Target group

  • The course is for current or prospective users of UPPMAX, HPC2N, LUNARC, or possibly other clusters in Sweden.

  • We therefore demonstrate the Python solutions on all three clusters, so that a broad audience can benefit.

  • We also provide links to the Python/Jupyter documentation at other Swedish HPC centres with personnel affiliated to NAISS.

Cluster-specific approaches

  • The course is a cooperation between UPPMAX (Rackham, Snowy, Bianca), HPC2N (Kebnekaise), and LUNARC (Cosmos). The main focus will be on UPPMAX’s systems, but Kebnekaise and Cosmos will be included as well. If you already have an account at Kebnekaise or Cosmos, you can use those systems for the hands-on sessions.

  • In most cases there is little or no difference between UPPMAX’s systems, HPC2N’s systems, and LUNARC’s systems (and the other HPC systems in Sweden), except naming of modules and such. We will mention (and cover briefly) instances when there are larger differences.

  • See below for a short introduction to the centre-specific cluster architectures of UPPMAX, HPC2N, and LUNARC.

How is the workshop run?

  • General sessions with small differences shown for UPPMAX, HPC2N, and LUNARC in tabs

  • Main focus on the NAISS resources at UPPMAX, but Kebnekaise/Cosmos specifics will be covered

  • Users who already have accounts/projects at HPC2N (Kebnekaise) or LUNARC (Cosmos) are welcome to use those systems for the exercises. UPPMAX/Rackham will be used by everyone else.

Prerequisites

Some practicals

Zoom

  • You should have gotten an email with the links

  • Main room for lectures (recorded)

  • Breakout rooms
    • exercises, including a silent room for those who just want to work on their own without interruptions.

    • help

  • The lectures and demos will be recorded, but NOT the exercises.
    • If you ask questions during the lectures, you may thus be recorded.

    • If you do not wish to be recorded, please keep your microphone muted and your camera off during lectures, and write your questions in the Q/A document (see the information below about the collaboration documents, which are also listed above).

  • Use your REAL NAME.

  • Please MUTE your microphone when you are not speaking

  • Use the “Raise hand” functionality under the “Participants” window during the lecture.

  • Please do not clutter the Zoom chat.

  • Behave politely!

Q/A collaboration document

The three HPC centers UPPMAX, HPC2N, and LUNARC

Three HPC centers

  • There are many similarities:

    • Login vs. calculation/compute nodes

    • Environment module system, with software hidden until loaded with module load

    • Slurm batch job and scheduling system

    • pip install procedure

  • … and small differences:

    • commands to load Python and Python packages (see the sketch after this list)

    • slightly different flags to Slurm

  • … and some bigger differences:

    • UPPMAX has three different clusters

      • Rackham for general purpose computing on CPUs only

      • Snowy, available for local projects; suits long jobs (< 1 month) and has GPUs

      • Bianca for sensitive data and has GPUs

    • HPC2N has Kebnekaise with GPUs

    • LUNARC has two systems

      • Cosmos (CPUs and GPUs)

      • Cosmos-SENS (sensitive data)

    • Conda is recommended only for UPPMAX and LUNARC users
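As an example of the small differences, discovering and loading Python looks slightly different at each centre. The sketch below is illustrative only; the module names and versions are assumptions, so check with module spider python on your cluster.

    # Discover available Python versions (if module spider is unavailable, try: module avail python)
    module spider python

    # Load Python -- the names/versions below are assumed examples
    module load python/3.11.8                # UPPMAX-style module name
    module load GCC/12.3.0 Python/3.11.3     # HPC2N/LUNARC-style (compiler toolchain + Python)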


Briefly about the cluster hardware and system at UPPMAX, HPC2N, and LUNARC

What is a cluster?

  • Login nodes and calculation/compute nodes

  • A network of computers, where each computer works as a node.

  • Each node contains several processor cores, RAM, and a local disk called scratch.

[Image: schematic of a node]
  • Users log in to the login nodes over the Internet, via SSH or ThinLinc.

    • Here, file management and lighter data analysis can be performed.

[Image: schematic of a cluster]
  • The calculation/compute nodes must be used for computationally intensive work; they are reached through the Slurm scheduler (a minimal batch script sketch follows).
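A minimal batch script sketch for reaching the compute nodes (the project ID is the course project on Rackham; the module name and my_script.py are assumed placeholders):

    #!/bin/bash
    #SBATCH -A naiss2024-22-1442    # project/account ID (course project on Rackham)
    #SBATCH -n 1                    # one task
    #SBATCH -t 00:10:00             # ten minutes of wall time

    # Load Python (module name is an assumption; check your cluster)
    module load python/3.11.8

    # Run a Python script (hypothetical file name)
    python my_script.py

Submit it with sbatch jobscript.sh and monitor it in the queue with squeue -u $USER.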

Overview of the UPPMAX systems

[Diagram: Overview of the UPPMAX systems. On Rackham, the login node reaches the calculation nodes via interactive or sbatch, and the same login node also reaches Snowy’s calculation nodes via interactive or sbatch. Bianca is reached from a terminal or ThinLinc with usr-sensXXX + 2FA + VPN, or from the Rackham login node with usr-sensXXX + 2FA and no VPN; the Bianca login leads to a private cluster whose calculation nodes are reached via interactive or sbatch.]

Overview of HPC2N’s system

[Diagram: Overview of HPC2N’s Kebnekaise. A terminal or ThinLinc session logs in to the login node, which reaches the compute nodes via interactive or sbatch.]

Overview of LUNARC’s systems

[Image: overview of the COSMOS resources]

Preliminary schedule

Preliminary schedule Thursday 5 December

Time        | Topic                                                               | Content                                 | Teacher(s)
9:00        | Introduction to the course, log in, load/run Python, find packages | Getting started with practical things   | All
9:55        | Coffee
10:10       | Install packages and isolated environments                         | Install, create and handle              | Björn
11:00       | Short leg stretch
11:10       | Reaching compute nodes with Slurm                                   | Batch jobs vs interactive work in IDEs  | Birgitte
11:50       | Catch-up time and Q/A (no recording)
12:00       | LUNCH
13:00-14:30 | Analysis with Python                                                | Matplotlib, IDEs and plots from scripts | Rebecca
13:55       | Short leg stretch (15 min)
14:30-15:30 | Using GPUs for Python                                               |                                         | Pedro?
14:50       | Coffee (15 min)
15:35       | Use cases and Q/A                                                   | Bring your own problems                 | All
16:35       | Summary + Evaluation

Preliminary schedule Friday 6 December

Time  | Topic                                     | Content                             | Teacher
9:00  | Analysis with Python part I               | Pandas                              | Rebecca
9:50  | Coffee
10:05 | Analysis with Python part II              | Pandas & Seaborn                    | Rebecca
10:55 | Short leg stretch
11:10 | Parallelism part I: MPI, Processes, Dask  | Processes, MPI                      | Pedro
12:00 | LUNCH
13:00 | Parallelism part II: MPI, Processes, Dask | Dask                                | Pedro
13:15 | Big Data with Python                      | File formats and packages, chunking | Björn
13:50 | Short leg stretch
14:05 | Machine and Deep Learning part I          | PyTorch, TensorFlow, scikit-learn   | Jayant
14:55 | Coffee
15:10 | Machine and Deep Learning part II         | PyTorch, TensorFlow, scikit-learn   | Jayant
15:50 | Short leg stretch
16:00 | Use cases and Q&A                         | Bring your own problems             | All
16:45 | Ending, with evaluation

Prepare your environment now!

  • Please log in to Rackham, Kebnekaise, Cosmos, or whichever cluster you are using.

  • For graphics, ThinLinc may be the best option.
  • Rackham can be accessed via regular SSH, via a ThinLinc client, or through a web-browser interface to ThinLinc:
  • Kebnekaise can be accessed via regular SSH, via a ThinLinc client, or through a web-browser interface to ThinLinc:
  • Cosmos:
    • SSH: cosmos.lunarc.lu.se

    • ThinLinc: cosmos-dt.lunarc.lu.se

Project

  • The course project on UPPMAX (Rackham) is: naiss2024-22-1442

  • The course project on HPC2N (Kebnekaise) is: hpc2n2024-142

  • The course project on LUNARC (Cosmos) is: `` ``

  • Rackham: ssh <user>@rackham.uppmax.uu.se

  • Rackham through ThinLinc,

    • use the App with
      • address: rackham-gui.uppmax.uu.se (NB: leave out the https://www.!)

      • user: <username-at-uppmax>

    • or go to <https://rackham-gui.uppmax.uu.se>

      • here, you’ll need two-factor authentication.

  • Create a working directory where you can code along. We recommend creating it under the course project storage directory

    • Example: if your username is “mrspock” and you are at UPPMAX, we recommend you create a user folder in the project folder of the course and step into it:

      cd /proj/hpc-python-fall
      mkdir mrspock
      cd mrspock

Exercises

  • Stay in/go to the folder you just created above!

  • You can download the exercises from the course GitHub repo, under the “Exercises” directory, or clone the whole repo!

    • Clone it with: git clone https://github.com/UPPMAX/HPC-python.git

    • Copy the tarball with wget https://github.com/UPPMAX/HPC-python/raw/refs/heads/main/exercises.tar.gz and then uncompress with tar -zxvf exercises.tar.gz

  • Get an overview here: https://github.com/UPPMAX/HPC-python/tree/main/Exercises

NOTE If you downloaded the tarball and uncompressed it, the exercises are under exercises/ in the directory you picked. Under that, you will find Python scripts in programs and batch scripts in the directories named for the sites.

NOTE If you are downloading / cloning from the course GitHub repo and into the above directory, your Python examples and batch submit file examples will be in a subdirectory of that.

Assuming you created a directory MYDIR-NAME under the project storage, you will find the examples as follows:

  • Python programs: /proj/hpc-python-fall/MYDIR-NAME/HPC-python/Exercises/examples/programs/

  • Batch submit files: /proj/hpc-python-fall/MYDIR-NAME/HPC-python/Exercises/examples/uppmax
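For example, to try one of the batch examples on Rackham (a sketch; the script name below is a hypothetical placeholder, so list the directory to see the actual files):

    # Go to the UPPMAX batch examples and pick one to submit
    cd /proj/hpc-python-fall/MYDIR-NAME/HPC-python/Exercises/examples/uppmax
    ls                       # see which batch scripts exist

    # Submit a script (hypothetical name; use one from the listing)
    sbatch example_job.sh

    # Check your job in the queue
    squeue -u $USER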

Use Thinlinc or terminal?

  • It is up to you!

  • Graphics are easier with ThinLinc, so it is recommended for the early session where we will plot a figure.

  • For this course, where you may have many windows open, it is often better to work in a terminal, to save screen space.

Content of the course