Introduction to running R, Python, Julia, and Matlab in HPC
Content
This course aims to give a brief, but comprehensive introduction to using Python, Julia, R, and Matlab in an HPC environment.
- You will learn how to
start the interpreters of Python, Julia, R, and Matlab by the HPC module system
find site installed packages/libraries
install packages/libraries yourself
use virtual environments
use the computation nodes
write batch scripts
work interactivly
This course will consist of lectures interspersed with hands-on sessions where you get to try out what you have just learned.
We aim to give this course in spring and fall every year.
Cluster-specific approaches
The course is a cooperation between UPPMAX (Rackham, Snowy, Bianca) and HPC2N (Kebnekaise) and will focus on the compute systems at both centres.
Although there are differences we will only have few seperate sessions.
Most participants will use UPPMAX’s systems for the course, as Kebnekaise is only for local (UmU, IRF, MIUN, SLU, LTU) users.
The general information given in the course will be true for all/most HPC centres in Sweden.
The examples will often have specific information, like module names and versions, which may vary. What you learn here should help you to make any changes needed for the other centres.
When present, links to the Python/Julia/R/Matlab documentation at other NAISS centres are given in the corresponding session.
Schedule Spring 2024
Day |
Language |
---|---|
Tuesday 22 October |
Python |
Wednesday 23 October |
Julia |
Thursday 24 October |
R |
Friday 25 October |
Matlab |
Some practicals
Code of Conduct
We strive to follow the Contributor Covenant Code of Conduct to foster an inclusive and welcoming environment for everyone.
In short:
Use welcoming and inclusive language
Be respectful of different viewpoints and experiences
Gracefully accept constructive criticism
Focus on what is best for the community
Show courtesy and respect towards other community members
Furthermore, as this is a hands-on, interactive workshop:
Be kind to each other and help each other as best you can.
If you can’t help someone or there is some problem, let someone know.
Finally, if you notice something that prevents you from learning as well as you can, let us know and don’t suffer silently, even the “little things”:
Volume too low?
Font size too small?
Generally confusing instructor?
Not enough breaks?
Zoom
The course is run over Zoom. You should have gotten an email with the links
There will be a zoom for the lectures
When you join the Zoom meeting, use your REAL NAME.
The lectures and demos will be recorded, but NOT the exercises.
If you ask questions during the lectures, you may thus be recorded.
If you do not wish to be recorded, then please keep your microphone muted and your camera off during lectures and write your questions in the Q/A document (see below about Microsoft-365 collaboration document).
Please MUTE your microphone when you are not speaking and use the “Raise hand” functionality under the “Participants” window during the lecture. Please do not clutter the Zoom chat. Behave politely!
There will be breakout rooms for the exercises.
You may enter there and you will get personal help
There will be a “silent” room for those who just want to sit and work
Collaboration document (Microsoft-365)
- Q/A collaboration document
Use this page for the workshop with your questions.
It becomes part of the course and some Q/A:s may be more relevant to you at a later occasion!!
Depending on how many helpers there are we’ll see how fast there are answers.
Some answers may come after the workshop day.
Hint
Main project on UPPMAX:
Project ID:
naiss2024-22-107
Directory name on rackham:
/proj/r-py-jl
Please create a suitably named subdirectory below
/proj/r-py-jl
, for your own exercises.
Local project on HPC2N:
Project ID:
hpc2n2024-025
Directory name on Kebnekaise:
/proj/nobackup/hpc2n2024-025
Please create a suitably named subdirectory below
/proj/nobackup/hpc2n2024-025
, for your own exercises.
Example of screen arrangement
If you only have one screen to work with for the course, try to fit:
╔════════════╗ ╔══════════════╗
║ ║ ║ TERMINAL ║
║ ║ ║ WINDOW ║
║ ZOOM ║ ╚══════════════╝
║ WINDOW ║ ╔══════════════╗
║ ║ ║ BROWSER ║
║ ║ ║ W/MATERIAL ║
╚════════════╝ ╚══════════════╝
Warning
Please be sure that you have gone through the `pre-requirements <https://uppmax.github.io/R-python-julia-HPC/prereqs.html>`_
It mentions the familiarity with the LINUX command line.
- The applications to connect to the clusters
terminals
remote graphical desktop ThinLinc
Material for improving your programming skills
This course does not aim to improve your coding skills.
Rather you will learn to understand the ecosystems and navigations for the the different languages on a HPC cluster.
See the pre-requirements, section Coding
Briefly about the cluster hardware and system at UPPMAX and HPC2N
What is a cluster?
Login nodes and calculations/computation nodes
A network of computers, each computer working as a node.
Each node contains several processor cores and RAM and a local disk called scratch.
The user logs in to login nodes via Internet through ssh or Thinlinc.
Here the file management and lighter data analysis can be performed.
The calculation nodes have to be used for intense computing.
The two HPC centers UPPMAX and HPC2N
Two HPC centers
There are many similarities:
Login vs. calculation/compute nodes
Environmental module system with software hidden until loaded with
module load
Slurm batch job and scheduling system
pip install
procedure
… and small differences:
commands to load Python, Python packages, R, Julia
slightly different flags to Slurm
… and some bigger differences:
UPPMAX has three different clusters
Rackham for general purpose computing on CPUs only
Snowy available for local projects and suits long jobs (< 1 month) and has GPUs
Bianca for sensitive data and has GPUs
HPC2N has Kebnekaise with GPUs
Conda is recommended only for UPPMAX users
Warning
At both HPC2N and UPPMAX we call the applications available via the module system modules.
To distinguish these modules from the python modules that work as libraries we refer to the later ones as packages.
See also
Details about the systems - https://uppmax.github.io/R-python-julia-HPC/prereqs.html#common-features - and below
Content of the course
- Introduction Python
- Load and run python
- Packages
- Isolated environments
- Running Python in batch mode
- Using GPUs with Python
- Interactive work on the compute nodes
- Introduction
- The different way HPC2N and UPPMAX provide for an interactive session
- Start an interactive session
- Check to be in an interactive session
- Check to have booked the expected amount of cores
- Running a Python script in an interactive session
- End the interactive session
- Exercises
- Exercise 0: be able to use the Python scripts
- Exercise 1
- Exercise 2
- Conclusion
- Links
- Jupyter on compute nodes
- Parallel and multithreaded functions
- Conda at UPPMAX
- Exercises
- Summary