Welcome to “Using Python in an HPC environment” course material

This material

Here you will find the content of the workshop Using Python in an HPC environment.

Content

  • This course aims to give a brief but comprehensive introduction to using Python in an HPC environment.

  • You will learn how to
    • use modules to load Python

    • find site installed Python packages

    • install packages yourself

    • use virtual environments

    • write a batch script for running Python

    • use Python in parallel

    • use Python for ML and on GPUs.

  • This course will consist of lectures interspersed with hands-on sessions where you get to try out what you have just learned.

  • We aim to give this course in spring and fall every year.

Your expectations?

  • Find best practices for using Python at an HPC centre

  • Learn how to use and install packages

  • Use the HPC capabilities of Python

Not covered

  • Improve Python coding skills

  • Specifics of other clusters

Warning

Target group

  • The course is for current or prospective users at UPPMAX or HPC2N, or possibly other clusters in Sweden.

  • Therefore we apply Python solutions on both clusters, so a broad audience can benefit.

  • We also provide links to the Python/Jupyter documentation at other Swedish HPC centres with personnel affiliated with NAISS.

Cluster-specific approaches

  • The course is a cooperation between UPPMAX (Rackham, Snowy, Bianca) and HPC2N (Kebnekaise). The main focus will be on UPPMAX’s systems, but Kebnekaise will be included as well. If you already have an account at Kebnekaise, you can use that system for the hands-on sessions.

  • In most cases there is little or no difference between UPPMAX’s systems and HPC2N’s systems (and the other HPC systems in Sweden), except naming of modules and such. We will mention (and cover briefly) instances when there are larger differences.

  • A short introduction to the centre-specific cluster architectures of UPPMAX and HPC2N follows further below.

How is the workshop run?

  • General sessions with small differences shown for UPPMAX and HPC2N in tabs

  • Main focus on the NAISS resources at UPPMAX, but Kebnekaise specifics will be covered

  • Users who already have accounts/projects at HPC2N/Kebnekaise are welcome to use that for the exercises. UPPMAX/Rackham will be used for everyone else.

Prerequisites

Some practicals

Zoom

  • You should have gotten an email with the links

  • Main room for lectures (recorded)

  • Breakout rooms
    • exercises, including a silent room for those who just want to work on their own without interruptions.

    • help

  • The lectures and demos will be recorded, but NOT the exercises.
    • If you ask questions during the lectures, you may thus be recorded.

    • If you do not wish to be recorded, then please keep your microphone muted and your camera off during lectures and write your questions in the Q/A document (see more information below about the collaboration documents which are also listed above).

  • Use your REAL NAME.

  • Please MUTE your microphone when you are not speaking

  • Use the “Raise hand” functionality under the “Participants” window during the lecture.

  • Please do not clutter the Zoom chat.

  • Behave politely!

Q/A collaboration document

Exercises

  • You can download the exercises from the course GitHub repo, under the “Exercises” directory: https://github.com/UPPMAX/HPC-python/tree/main/Exercises
    • On HPC2N, you can copy the exercises in a tarball from /proj/nobackup/support-hpc2n/bbrydsoe/examples.tar.gz

    • On UPPMAX you can copy the exercises in a tarball from /proj/naiss2023-22-1126/examples.tar.gz

Project

  • The course project on UPPMAX (Rackham) is: naiss2023-22-1126

  • If you work on Kebnekaise you may use projects you already have. The CPU hours used in this course are probably negligible.

The two HPC centers UPPMAX and HPC2N

Two HPC centers

  • There are many similarities:

    • Login vs. calculation/compute nodes

    • Environment module system, with software hidden until loaded with module load

    • Slurm batch job and scheduling system

    • pip install procedure

  • … and small differences:

    • commands to load Python, Python packages, R, Julia

    • slightly different flags to Slurm

  • … and some bigger differences:

    • UPPMAX has three different clusters

      • Rackham, for general purpose computing on CPUs only

      • Snowy, available for local projects; it suits long jobs (< 1 month) and has GPUs

      • Bianca, for sensitive data; it has GPUs

    • HPC2N has Kebnekaise with GPUs

    • Conda is recommended only for UPPMAX users
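Because both centres share the same basic workflow (load a Python module, then use site-installed packages or install your own with pip), a quick way to see whether a package is already available is to ask the loaded interpreter itself. The following is a minimal sketch using only the standard library. The helper name `package_status` and the package names in the loop are illustrative assumptions, not part of the course material.

```python
import importlib.metadata
import importlib.util


def package_status(name):
    """Return the installed version of `name`, or None if it cannot be imported.

    Assumption of this sketch: `name` is both the import name and the
    distribution name, which holds for many packages but not all.
    """
    if importlib.util.find_spec(name) is None:
        return None
    try:
        return importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but no distribution metadata (e.g. standard library modules).
        return "unknown"


if __name__ == "__main__":
    # Hypothetical package names, chosen only for illustration.
    for pkg in ("numpy", "pip", "surely_not_installed_pkg"):
        print(pkg, "->", package_status(pkg))
```

The result depends on which Python module (and which virtual environment, if any) is active when the script runs, so it is worth running again after changing modules.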

Warning

Briefly about the cluster hardware and system at UPPMAX and HPC2N

What is a cluster?

  • Login nodes and calculation/compute nodes

  • A network of computers, each computer working as a node.

  • Each node contains several processor cores and RAM and a local disk called scratch.

  • The user logs in to the login nodes over the Internet through SSH or ThinLinc.

    • File management and lighter data analysis can be performed here.

  • The calculation nodes have to be used for intense computing.
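To make the node and core picture concrete, here is a small sketch of how a Python program might find out how many cores it is allowed to use. It relies on the Slurm environment variable `SLURM_CPUS_PER_TASK`, which is only set inside a job that requested `--cpus-per-task`. The function name `available_cores` and the fallback order are assumptions of this sketch.

```python
import os


def available_cores():
    """Number of CPU cores this process is allowed to use."""
    # Slurm sets SLURM_CPUS_PER_TASK inside a job when --cpus-per-task is given.
    slurm_value = os.environ.get("SLURM_CPUS_PER_TASK")
    if slurm_value and slurm_value.isdigit():
        return int(slurm_value)
    # On Linux, the affinity mask lists the cores this process may run on.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback: all cores on the machine (may overcount inside a job).
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("Cores available to this process:", available_cores())
```

On a login node this typically reports the cores of that shared machine, which is one more reason to move intense computing to the calculation nodes.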

Common features

  • Intel CPUs

  • Linux kernel

  • Bash shell

Hardware

| Technology | Kebnekaise | Rackham | Snowy | Bianca |
|---|---|---|---|---|
| Cores/compute node | 28 (72 for largemem part) | 20 | 16 | 16 |
| Memory/compute node | 128-3072 GB | 128-1024 GB | 128-4096 GB | 128-512 GB |
| GPU | NVidia V100, A100, old K80s | None | NVidia T4 | NVidia A100 |
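The hardware figures above can be turned into a rough memory-per-core estimate, which is useful when deciding how many cores to request for a memory-hungry job. The sketch below assumes that the smallest-memory node on each cluster has the standard core count listed above. That pairing is an assumption of this example, not something the hardware overview states.

```python
# Smallest node configuration per cluster, copied from the hardware overview:
# (cores per compute node, minimum memory per compute node in GB).
# Assumption: the minimum-memory nodes have the standard core count.
NODES = {
    "Kebnekaise": (28, 128),
    "Rackham": (20, 128),
    "Snowy": (16, 128),
    "Bianca": (16, 128),
}


def memory_per_core_gb(cores, memory_gb):
    """Memory available per core if a node's memory is shared evenly."""
    return memory_gb / cores


if __name__ == "__main__":
    for cluster, (cores, memory_gb) in NODES.items():
        print(f"{cluster}: {memory_per_core_gb(cores, memory_gb):.1f} GB per core")
```

Under these assumptions, Rackham comes out at 6.4 GB per core and Snowy and Bianca at 8.0 GB per core.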

Overview of the UPPMAX systems

graph TB

  Node1 -- interactive --> SubGraph2Flow
  Node1 -- sbatch --> SubGraph2Flow
  subgraph "Snowy"
  SubGraph2Flow(calculation nodes) 
        end

        thinlinc -- usr-sensXXX + 2FA + VPN ----> SubGraph1Flow
        terminal -- usr --> Node1
        terminal -- usr-sensXXX + 2FA + VPN ----> SubGraph1Flow
        Node1 -- usr-sensXXX + 2FA + no VPN ----> SubGraph1Flow
        
        subgraph "Bianca"
        SubGraph1Flow(Bianca login) -- usr+passwd --> private(private cluster)
        private -- interactive --> calcB(calculation nodes)
        private -- sbatch --> calcB
        end

        subgraph "Rackham"
        Node1[Login] -- interactive --> Node2[calculation nodes]
        Node1 -- sbatch --> Node2
        end

Preliminary schedule


| Time | Topic | Activity |
|---|---|---|
| 9:00 | Syllabus and the clusters | |
| 9:20 | Introduction to Python | Lecture |
| 9:30 | Loading and running Python and using installed packages | Lecture + type-along |
| 9:55 | Coffee | |
| 10:10 | Installing packages and isolated environments | Lecture + type-along |
| 10:35 | SLURM batch scripts and arrays for Python jobs | Lecture + type-along + exercise |
| 11:10 | Short leg stretch | |
| 11:20 | Interactive | Lecture + type-along |
| 11:50 | Catch-up time and Q/A (no recording) | Q/A |
| 12:00 | LUNCH | |
| 13:00 | Parallelising simple Python codes | Lecture + type-along + exercise |
| 13:40 | Using GPUs for Python | Lecture + type-along + exercise |
| 14:20 | Short leg stretch | |
| 14:30 | Using Python for Machine Learning jobs | Lecture + type-along + exercise |
| 15:10 | Coffee | |
| 15:25 | Summary | |
| 15:30 | Exercises and Q&A on-demand | |
| 15:55 | Evaluation info | |
| 16:00 | END | |

Prepare your environment now!

  • Please log in to Rackham, Kebnekaise, or whichever other cluster you are using.

  • For graphics, ThinLinc may be the best option.
  • Rackham: ssh <user>@rackham.uppmax.uu.se

  • Rackham through ThinLinc,

  • Create a working directory where you can code along. We recommend creating it under the course project storage directory

  • Example: if your username is “mrspock” and you are at UPPMAX, we recommend that you create a user folder in the project folder of the course and step into it:

    • cd /proj/naiss2023-22-1126

    • mkdir mrspock

    • cd mrspock

  • Clone the course repo

    • git clone https://github.com/UPPMAX/HPC-python.git

NOTE If you download or clone the course GitHub repo into the above directory, your Python examples and batch submit file examples will be in a subdirectory of it.

Assuming you created a directory MYDIR-NAME under the project storage, you will find the examples as follows:

Python programs

/proj/naiss2023-22-1126/MYDIR-NAME/HPC-python/Exercises/examples/programs/

Batch submit files

/proj/naiss2023-22-1126/MYDIR-NAME/HPC-python/Exercises/examples/uppmax
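The directory layout above can be sketched in a few lines of Python, which may help when checking where the examples end up after cloning. The function name `example_dirs` is a hypothetical helper for illustration. The project path and subdirectory names are taken from the text above, and “mrspock” is the example username used earlier.

```python
from pathlib import PurePosixPath

# Course project storage on UPPMAX, as given in the course material.
PROJECT_DIR = PurePosixPath("/proj/naiss2023-22-1126")


def example_dirs(my_dir_name):
    """Return (programs_dir, batch_dir) for a clone made inside my_dir_name."""
    examples = PROJECT_DIR / my_dir_name / "HPC-python" / "Exercises" / "examples"
    return examples / "programs", examples / "uppmax"


if __name__ == "__main__":
    programs, batch = example_dirs("mrspock")
    print("Python programs:   ", programs)
    print("Batch submit files:", batch)
```

`PurePosixPath` only builds the path strings and does not touch the file system, so the sketch runs anywhere, not just on the cluster.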

Use ThinLinc or terminal?

  • It is up to you!

  • Graphics come easier with ThinLinc

  • For this course, where you will have many windows open, it may be better to work in a terminal in most cases to save screen space.