Packages

  • practice determining the version of a Python package

  • practice determining that a Python package is not installed

  • practice loading a Python machine learning module

  • practice installing a Python package

Compute allocations in this workshop

  • Rackham: naiss2024-22-107

  • Kebnekaise: hpc2n2024-025

Storage space for this workshop

  • Rackham: /proj/r-py-jl

  • Kebnekaise: /proj/nobackup/hpc2n2024-025

Introduction

Packages are pieces of Python code written to be used by others. When possible, using an existing Python package is usually smarter than writing code yourself. In this session, we practice working with packages.

Finding packages

flowchart TD
  using_python_packages[Using Python packages]
  installed_with_python[Installed with Python module?]
  use_python[Load Python module]
  installed_with_module[Installed with Python package module?]
  use_module[Load Python package module]
  install_yourself[Install it yourself]

  using_python_packages --> installed_with_python
  installed_with_python -->|yes|use_python
  installed_with_python -->|no|installed_with_module
  installed_with_module -->|yes|use_module
  installed_with_module -->|no|install_yourself

The most common Python packages come installed when loading a regular Python module. Some of the more complex packages are part of a separate module for more complex Python packages. If a package is not installed in either, you can install it yourself.

Python package installers

flowchart TD
  python[Python]
  packages[packages]
  code_written_by_others[code written by others]
  package_install_system[package installation system]
  pip[pip]
  conda[conda]
  uppmax[UPPMAX]
  hpc2n[HPC2N]

  python -->|has| packages
  packages -->|are|code_written_by_others
  packages -->|installed by|package_install_system
  package_install_system --> pip
  package_install_system --> conda
  conda -->|works on|uppmax
  pip -->|works on|uppmax
  pip -->|works on|hpc2n

Two commonly used Python package installers are conda and pip. Their support differs between the HPC clusters used in this course:

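| Package installer | HPC2N           | UPPMAX (Rackham) | UPPMAX (Bianca) |
|-------------------|-----------------|------------------|-----------------|
| conda             | Unsupported [1] | Recommended      | Recommended     |
| pip               | Recommended     | Supported        | Unsupported [2] |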

In this session we use pip, because it is a commonly used package installation system that works on both HPC clusters used in this course.

Conda is not scheduled for discussion in this course, but teaching materials can be found at Conda at UPPMAX.

As a first impression, here is a simple comparison between the two:

| Parameter                    | conda | pip |
|------------------------------|-------|-----|
| Installs Python packages     | Yes   | Yes |
| Installs non-Python software | Yes   | No  |

In this session, we will install packages to your default user folder. Because there is only this one default user folder, installing a different version of a package for one computational experiment may have consequences for your other experiments. These problems are addressed in the session on isolated environments.
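To make the "default user folder" concrete, here is a minimal sketch that asks Python itself where user-installed packages go. It only uses the standard-library `site` module. The exact path it prints depends on the cluster and the Python module that is loaded, so no output is shown here.

```python
import site
import sys

# Directory where `pip install --user` places packages for this interpreter.
user_site = site.getusersitepackages()
print("User site-packages folder:", user_site)

# Python only searches this folder when user site-packages are enabled;
# inside a venv virtual environment, for example, it is normally disabled.
print("User site enabled:", bool(site.ENABLE_USER_SITE))

# True if the folder is currently on the module search path.
print("On sys.path:", user_site in sys.path)
```

Because every experiment that uses the same Python version shares this one folder, a package upgraded there is upgraded for all of them.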

Exercises

These exercises follow a common user journey for a user who needs to use certain Python packages:

  • In exercise 1, we determine if a Python package is already installed

  • In exercise 2, we determine if a machine learning Python package is already installed

  • If all fails, in exercise 3, we install a Python package ourselves

Like any user, we’ll try to be autonomous and read the (hopefully well-written!) UPPMAX documentation.

Exercise 1

Learning objectives

  • Practice reading documentation

  • Apply/rehearse the documentation to load a module

  • Apply the documentation to show if a Python package is already installed

  • Observe what it looks like when a package is not installed

Imagine you want to use the Python packages pandas, tensorflow-cpu and mhcnuggets. Here we see that one of them comes already installed when loading a Python module.

Read the UPPMAX documentation on how to load Python.

Then do:

  • HPC2N: load the modules GCC/12.3.0 and Python 3.11.3

  • UPPMAX: load the module python/3.11.8

Read the UPPMAX documentation on how to determine if a Python package comes with your Python module.

Is the Python package pandas installed? If yes, which version?

Is the Python package tensorflow-cpu installed? If yes, which version?

Is the Python package mhcnuggets installed? If yes, which version?
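As a complementary check from within Python itself, the sketch below looks up the installed version of each of the three packages. It assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library. The helper name `installed_version` is made up for this illustration. The names passed in are the distribution names as used by pip, which can differ from the names used in `import` statements.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ["pandas", "tensorflow-cpu", "mhcnuggets"]:
    found = installed_version(name)
    if found is None:
        print(f"{name}: not installed")
    else:
        print(f"{name}: version {found}")
```

The result depends on which modules are loaded when the script runs, so running it again after loading a different module may give different answers.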

Exercise 2

Learning objectives

  • Practice reading documentation

  • Rehearse the documentation to load a Python machine learning module

  • Apply the documentation to show if a Python package is already installed

  • Observe what it looks like when a package is not installed

Imagine you want to use the Python packages pandas, tensorflow-cpu and mhcnuggets. Here we see that two of them come already installed with a Python machine learning module.

Read:

Do:

  • UPPMAX: Which of the versions should you use? Load the latest Python machine learning module for that version.

  • HPC2N: Load the latest module

Read the UPPMAX documentation on how to determine if a Python package comes with your Python module.

Is the Python package pandas installed? If yes, which version?

Answer:

  • HPC2N: Is the Python package tensorflow-cpu installed? If yes, which version?

  • UPPMAX: Is the Python package tensorflow-cpu installed? If yes, which version?

Is the Python package mhcnuggets installed? If yes, which version?

Exercise 3

Learning objectives

  • Practice reading documentation

  • Install a new package.

  • Rehearse determining if a Python package is already installed

Imagine you want to use the Python packages pandas, tensorflow-cpu and mhcnuggets. Even after loading a bigger module, one of the packages was not installed for us. Here we install that Python package ourselves.

Read the UPPMAX documentation on how to install Python packages using pip.

We will be using the first method: installing with --user.

In which folder do the Python packages end up?

Try to come up with a reason why this would be important to know.

Install the package mhcnuggets.

Confirm that the Python package mhcnuggets is installed now. Which version has been installed?
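For readers who prefer to drive the installation from a Python script, the following sketch builds the equivalent pip command. The helper name `pip_install_user_command` is hypothetical. The sketch only prints the command. The line that would actually run it is left as a comment, so that nothing is installed by accident.

```python
import shlex
import sys
from typing import List


def pip_install_user_command(package: str) -> List[str]:
    """Build the command that installs `package` into the user folder."""
    # `sys.executable -m pip` runs the pip that belongs to the running Python.
    return [sys.executable, "-m", "pip", "install", "--user", package]


command = pip_install_user_command("mhcnuggets")
print(shlex.join(command))

# To actually run the installation from Python, one could use:
#   import subprocess
#   subprocess.run(command, check=True)
```

Calling pip through `sys.executable -m pip` is a design choice that avoids accidentally using a pip from a different Python than the one provided by the loaded module.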

Conclusion

Keypoints

You have:

  • determined whether a Python package is installed using pip

  • discovered that some Python packages are already installed upon loading a module

  • installed a Python package using pip

However, the installed package was put into a shared (as in, not isolated) environment.

Luckily, isolated environments are discussed in this course too :-)