Python packages

  • navigate the documentation

  • determine which Python packages are installed

  • load a module that adds more pre-installed Python packages

  • install a Python package

Compute allocations in this workshop

  • Rackham: naiss2024-22-1202

  • Kebnekaise: hpc2n2024-114

  • Cosmos: lu2024-7-80

Storage space for this workshop

  • Rackham: /proj/r-py-jl-m-rackham

  • Kebnekaise: /proj/nobackup/r-py-jl-m

Introduction

Packages are pieces of Python code written to be used by others. When possible, using an existing Python package is usually smarter than writing code yourself. In this session, we practice working with packages.

Finding packages

flowchart TD
  using_python_packages[Using Python packages]
  installed_with_python[Installed with Python module?]
  use_python[1.Done]
  installed_with_module[Installed with Python package module?]
  use_module[2.Load Python module]
  install_yourself[3.Install it yourself]

  using_python_packages --> installed_with_python
  installed_with_python -->|yes|use_python
  installed_with_python -->|no|installed_with_module
  installed_with_module -->|yes|use_module
  installed_with_module -->|no|install_yourself

The most common Python packages are installed when you load a regular Python module. Some of the more complex packages are part of a separate module for more complex Python packages. If a package is available in neither, you can install it yourself.
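Whether a package is already available can also be checked from within Python. The sketch below is a minimal illustration using the standard library's `importlib.util.find_spec`. The helper name `is_importable` and the made-up package name are hypothetical and not part of any center's tooling.

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if Python can find the top-level package `package_name`."""
    # find_spec returns None when no top-level package with that name is found
    return importlib.util.find_spec(package_name) is not None


# 'json' ships with Python itself, so it can always be found
print(is_importable("json"))
# A made-up name that no installed package is expected to use
print(is_importable("no_such_package_xyz"))
```

If the answer is negative after loading the regular Python module, the next steps in the flowchart apply: look for a module that provides the package, or install it yourself.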

Python package installers

flowchart TD
  python[Python]
  packages[packages]
  code_written_by_others[code written by others]
  package_install_system[package installation system]
  pip[pip]
  conda[conda]
  uppmax[UPPMAX's Rackham]
  hpc2n[HPC2N]
  lunarc[LUNARC]

  python -->|has| packages
  packages -->|are|code_written_by_others
  packages -->|installed by|package_install_system
  package_install_system --> pip
  package_install_system --> conda
  conda -->|works on|uppmax
  conda -->|works on|lunarc
  pip -->|works on|uppmax
  pip -->|works on|hpc2n
  pip -->|works on|lunarc

There are two Python package installers, called conda and pip.

In this session, we use pip, as it can be used on all the HPC clusters used in this course:

| Package installer | HPC2N       | LUNARC      | UPPMAX’s Rackham |
|-------------------|-------------|-------------|------------------|
| conda             | Unsupported | Recommended | Supported        |
| pip               | Recommended | Supported   | Supported        |

In this session we use pip, because it is a commonly used package installation system that works on all HPC clusters used in this course. The use of conda, and how it differs from pip, is described in this course’s ‘Extra Reading’ section Conda at UPPMAX.

In this session, we will install packages to your default user folder. Because there is only one default user folder, installing a different version of a package for one computational experiment may have consequences for your other experiments. These problems are addressed in the session on isolated environments.
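To see which folder is meant, Python can report its per-user package folder. This is a minimal sketch using the standard `site` module. On Linux the result is typically a folder under `~/.local`, but the exact path depends on the Python version and on your center's setup.

```python
import site
import sys

# Folder where packages installed for the current user end up
user_site = site.getusersitepackages()
print("User site-packages folder:", user_site)

# True if Python currently searches that folder when importing packages
print("On the import path:", user_site in sys.path)
```

Every Python session of the same Python version shares this one folder, which is why a package version installed for one experiment is also seen by the others.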

Exercises

These exercises follow a common user journey for a user who needs to use a certain Python package:

  • In exercise 1, we use a Python package that comes with the Python module

  • In exercise 2, we use a Python package that comes with a software module

  • In exercise 3, we install a Python package ourselves

Like any user, we’ll try to be autonomous and read the documentation of our favorite HPC center.

Exercise 1: loading a Python package that comes with the Python module

Learning objectives

  • Practice reading documentation

  • Apply/rehearse the documentation to load a module

  • Apply the documentation to show if a Python package is already installed

Some Python packages become available when you load a Python module. Here we see this in action.

For this exercise, use the documentation of your HPC center.

Load the Python module of the correct version, including prerequisite modules if needed:

| Center | Python version |
|--------|----------------|
| HPC2N  | 3.11.3         |
| LUNARC | 3.11.3         |
| UPPMAX | 3.11.8         |

How do you determine whether a Python package is installed?

The Python package wheel is known to be installed. Which version?
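Besides the pip commands described in your center's documentation, the version of an installed package can also be queried from Python itself. The sketch below assumes Python 3.8 or newer, where the standard `importlib.metadata` module is available. The helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution_name: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        # No installed package with this name was found
        return None


# Prints the version if wheel is installed, or None otherwise
print("wheel:", installed_version("wheel"))
```

The printed version depends on the Python module you loaded, so compare it with what pip reports on your cluster.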

Exercise 2: loading a Python package that comes with a module

Learning objectives

  • Practice reading documentation

  • Load a Python package module

Some Python packages need another module to be loaded. In this exercise, we search for and load a module that provides a pre-installed Python package. The Python package we use differs by center:

  • HPC2N: Theano, as a Python 3.7.4 package

  • LUNARC: matplotlib version 3.8.2

  • UPPMAX: TensorFlow, as a Python 3.11.8 package for CPU

Search your center’s documentation to find out which module provides your Python package.

Load the module for the Python package and verify if it is loaded.
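One way to verify that the package really is usable after loading the module is to import it and look at where it comes from. The following is a minimal sketch, not a center-specific recipe. The helper name `describe_package` is hypothetical, and `json` only serves as a stand-in that exists in every Python installation.

```python
import importlib
from typing import Dict, Optional


def describe_package(package_name: str) -> Optional[Dict[str, Optional[str]]]:
    """Import a package and report its file location and version, if any."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        # The package (or one of its dependencies) could not be imported
        return None
    return {
        "file": getattr(module, "__file__", None),
        # Many, but not all, packages define __version__
        "version": getattr(module, "__version__", None),
    }


# 'json' is only a stand-in: use your center's package, such as 'matplotlib'
print(describe_package("json"))
```

The reported file path hints at whether the package comes from the software module you loaded or from your own user folder.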

Exercise 3: installing a Python package yourself

Learning objectives

  • Practice reading documentation

  • Install a new package

Some Python packages are not pre-installed on your HPC cluster. Here we install a Python package ourselves.

Use your center’s documentation to find out how to install Python packages using pip.

Install a Python package called mhcnuggets. Which version gets installed?
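As a sketch of what such an installation can look like when driven from Python, the example below builds a pip command that installs into the default user folder. It assumes that pip's `--user` option is appropriate on your cluster and that no virtual environment is active. Your center's documentation may recommend different options. The helper names `pip_install_command` and `install` are hypothetical.

```python
import subprocess
import sys
from typing import List


def pip_install_command(package_name: str) -> List[str]:
    """Build a command that installs a package into the default user folder."""
    # Running pip via sys.executable uses the pip of the Python in use
    return [sys.executable, "-m", "pip", "install", "--user", package_name]


def install(package_name: str, dry_run: bool = True) -> List[str]:
    """Print the install command; only run it when dry_run is False."""
    command = pip_install_command(package_name)
    print(" ".join(command))
    if not dry_run:
        # check=True raises CalledProcessError if pip reports a failure
        subprocess.run(command, check=True)
    return command


# Dry run: shows the command without installing anything
install("mhcnuggets")
```

Calling `install("mhcnuggets", dry_run=False)` would run the command for real. The dry run is the default so that nothing is installed by accident.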

Conclusion

Keypoints

You have:

  • determined whether a Python package is installed, using pip

  • discovered that some Python packages are already installed upon loading a module

  • installed a Python package using pip

However, the installed package was put into a shared (as in, not isolated) environment.

Luckily, isolated environments are discussed in this course too :-)