Introduction
- Welcome page and syllabus
Also linked from the house symbol 🏠 at the top of the page
Learning outcomes
Load Python modules and site-installed Python packages
Create a virtual environment
Install Python packages with pip (Kebnekaise, Rackham, Snowy, Cosmos)
Write a batch script for running Python
Use the compute nodes interactively
Use Python in parallel
Use Python for ML
Use GPUs with Python
What is Python?
As you probably already know…
“Python combines remarkable power with very clear syntax.
It has modules, classes, exceptions, very high level dynamic data types, and dynamic typing.
There are interfaces to many system calls and libraries, as well as to various windowing systems. …”
In particular, what sets Python apart from other languages is its fantastic open-source ecosystem for scientific computing and machine learning, with libraries like NumPy, SciPy, scikit-learn and PyTorch.
The YouTube video Thinking about Concurrency is a good introduction to writing concurrent programs in Python.
The book High Performance Python is a good resource for ways of speeding up Python code.
Material for improving your programming skills
First level
The Carpentries teaches basic lab skills for research computing.
General introduction to Python by UPPMAX at https://www.uu.se/en/centre/uppmax/study/courses-and-workshops/introduction-to-uppmax
Second level
Other courses/workshops given by NAISS HPC centres:
CodeRefinery develops and maintains training material on software best practices for researchers that already write code. Their material addresses all academic disciplines and tries to be as programming language-independent as possible.
Aalto Scientific Computing
Third level
ENCCS (EuroCC National Competence Centre Sweden) is a national centre that supports industry, public administration and academia in accessing and using European supercomputers. They give higher-level training in programming and in specific software.
Documentation at other NAISS centres
Important
Project ID and storage directory
- UPPMAX:
Project ID: naiss2024-22-1442
Storage directory: /proj/hpc-python-fall
- HPC2N:
Project ID: hpc2n2024-142
Storage directory: /proj/nobackup/hpc-python-fall-hpc2n
- LUNARC:
Project ID: lu2024-2-88
Storage directory: /lunarc/nobackup/projects/lu2024-17-44
- NSC:
Project ID: naiss2024-22-1493
Storage directory: /proj/hpc-python-fall-nsc
Log in to the center where you have an account, go to the storage directory, and create a directory below it to work in. You can name this directory whatever you want, but your username is a good option.
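The step above can be sketched as a small shell helper. This is only an illustration, not part of the official course setup: the function name `make_workdir` is hypothetical, and the sketch assumes that you want the directory named after your username and that you have write permission in the storage directory.

```shell
# Hypothetical helper: create a personal working directory below a
# storage directory and print its path.
make_workdir() {
    storage_dir="$1"
    # Use the login name as the directory name (an assumption; any name works).
    workdir="$storage_dir/${USER:-$(id -un)}"
    mkdir -p "$workdir" && printf '%s\n' "$workdir"
}

# Example for UPPMAX (storage directory taken from the list above):
# cd "$(make_workdir /proj/hpc-python-fall)"
```

Replace the path in the example with the storage directory of your own center. `mkdir -p` does not fail if the directory already exists, so the helper is safe to run more than once.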
Important
Course material
- You can get the course material, including exercises, from the course repository on GitHub. Do one of the following:
- Clone it:
git clone https://github.com/UPPMAX/HPC-python.git
- Download the zip file and unzip it:
wget https://github.com/UPPMAX/HPC-python/archive/refs/heads/main.zip
unzip main.zip
Run either of the above from your own directory under the course storage directory at the HPC center of your choice.
Objectives
We will:
teach you how to navigate the module system at HPC2N, UPPMAX, LUNARC, and NSC
show you how to find out which versions of Python and packages are installed
look at the package handler pip
explain how to create and use virtual environments
show you how to run batch jobs
show some examples with parallel computing and using GPUs
guide you in how to start Python tools for Machine Learning