Isolated environments

  • remember there are multiple virtual environment managers

  • practice creating, activating, working in and deactivating a virtual environment

Compute allocations in this workshop

  • Rackham: naiss2024-22-1202

  • Kebnekaise: hpc2n2024-114

  • COSMOS: lu2024-7-80

Storage space for this workshop

  • Rackham: /proj/r-py-jl-m-rackham

  • Kebnekaise: /proj/nobackup/r-py-jl-m

  • COSMOS: has no project storage for this workshop; use your home folder

Introduction

Different experiments may need different versions of Python and/or Python packages. Virtual environments allow one to work with multiple sets of (potentially incompatible) packages, where each set is independent and isolated.

Additionally, you may want a reproducible computational environment, so that others can rerun your computational experiments. Virtual environments can be exported and imported, which makes a computational environment easier to reproduce.
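As a rough illustration of what exporting an environment involves, the sketch below collects the name and version of every package visible to the running interpreter and writes them as pinned `name==version` lines. It is a minimal sketch, not the method used in this course: in practice one would typically run `pip freeze` and later `pip install -r` on the resulting file. The function names and the output file name are hypothetical.

```python
from importlib import metadata


def freeze_lines():
    """Return 'name==version' lines for every distribution this interpreter can see."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


def export_requirements(path):
    """Write the pinned package list to 'path' (file name chosen by the caller)."""
    lines = freeze_lines()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return len(lines)


# Show a few entries; the full list depends on the environment that runs this
for line in freeze_lines()[:5]:
    print(line)
```

Calling `export_requirements("requirements.txt")` inside an activated environment would record that environment's packages, so that someone else can install the same versions elsewhere.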

In this session, we create, activate, use, deactivate, export and import some virtual environments.

Virtual environment managers

flowchart TD
  python[Python]
  package_versions[combination of package versions]
  virtual_environment_manager[virtual environment manager]
  create_isolated_environments[create isolated environments]
  venv[venv virtualenv]
  conda[conda]
  hpc2n[HPC2N]
  lunarc[LUNARC]
  uppmax[UPPMAX's Rackham]

  python -->|has| package_versions
  package_versions -->|managed by|virtual_environment_manager
  virtual_environment_manager --> |has goal|create_isolated_environments
  package_versions -.- create_isolated_environments
  virtual_environment_manager --> |among others|conda
  virtual_environment_manager --> |among others|venv

  conda -->|works on|uppmax
  conda -->|works on|lunarc
  venv -->|works on|uppmax
  venv -->|works on|hpc2n
  venv -->|works on|lunarc

In this course, we will look at the following environment managers:

Manager   HPC2N         LUNARC   UPPMAX’s Rackham   Scope
conda     Avoid         OK       Avoid              Language agnostic
venv      Recommended   OK       Recommended        Python only

Although venv has an official Python ‘Virtual Environments and Packages’ tutorial, most centers have their own documentation on virtual environment managers, with information specific to their clusters.

In this session, we use venv, as it works for all centers.

General workflow

flowchart TD
  create[Create]
  activate[Activate]
  use[Use]
  deactivate[Deactivate]

  create --> activate
  activate --> use
  use --> deactivate
  deactivate --> activate

Whatever environment manager you use, this is the workflow:

  • You create the isolated environment

  • You activate the environment

  • You work in the isolated environment. Here you install (or update) the packages you need

  • You deactivate the environment after use
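The workflow above can also be sketched from within Python, using the standard-library `venv` module instead of the command line. This is only an illustration under a few assumptions: the environment is created in a temporary folder, without pip to keep it fast, and the `bin/python` path assumes a Linux system such as the clusters in this course. The function names are hypothetical. Instead of activating the environment, the sketch calls the environment's own interpreter directly, which has the same effect for a single command.

```python
import subprocess
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir):
    """Create a bare virtual environment and return the path to its interpreter."""
    venv.create(env_dir, with_pip=False)
    # Assumption: Linux or macOS layout, where the interpreter lives in bin/
    return Path(env_dir) / "bin" / "python"


def prefix_seen_by(python_executable):
    """Ask an interpreter which environment it believes it is running in."""
    result = subprocess.run(
        [str(python_executable), "-c", "import sys; print(sys.prefix)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo_env"
    python_in_env = create_environment(env_dir)  # create
    print(prefix_seen_by(python_in_env))         # use: prints the environment folder
# Leaving the 'with' block deletes the throwaway environment
```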

A virtual environment can be created in multiple ways, for example, from scratch. However, there are more efficient ways, which we will use.

Exercises

Need a video?

You can see a video on how these exercises are done here:

In these exercises, we first make sure we are using isolated environments, after which we create, activate, use and deactivate one.

Exercise 1: remove the Python packages installed in the home folder

In the previous session, we installed Python packages in the home folder. These will interfere with our virtual environments.

To make sure your virtual environments work, ruthlessly delete the Python packages in your home folder:

rm -Ir ~/.local/lib/python3.11

You will be asked to confirm.

This works for all centers.
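To see whether any user-installed packages are left, one can ask Python where it looks for them. The sketch below is a minimal check, with a hypothetical function name: it prints the per-user site-packages folder of the running interpreter and counts the entries in it. After the cleanup above, the folder is expected to be missing or empty.

```python
import site
from pathlib import Path


def user_site_packages():
    """Return the per-user site-packages folder and the names of its entries."""
    user_site = Path(site.getusersitepackages())
    if not user_site.is_dir():
        return user_site, []
    return user_site, sorted(entry.name for entry in user_site.iterdir())


location, entries = user_site_packages()
print(f"User site-packages: {location}")
print(f"Entries found: {len(entries)}")
```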

Exercise 2: work with vpyenv

  • Create a Python virtual environment by following step-by-step instructions

In this exercise, we create the course environment vpyenv in a step-by-step fashion:

flowchart TD
  load_modules[1.Load modules]
  create[2.Create]
  activate[3.Activate]
  install_libraries[4.Install Python libraries]
  check[5.Check installed Python libraries]
  deactivate[6.Deactivate]

  load_modules --> create
  create --> activate
  activate --> install_libraries
  install_libraries --> check
  check --> deactivate
  deactivate --> activate

We create the virtual environment needed for this course, called vpyenv. As virtual environments can take up a lot of disk space, we create it in the course project folder.

Exercise 2.1: load the modules needed

module load GCC/12.3.0 Python/3.11.3 SciPy-bundle/2023.07 matplotlib/3.7.2

This virtual environment will be used in later sessions too and is assumed to contain the seaborn Python package. The SciPy-bundle/2023.07 and matplotlib/3.7.2 modules provide packages that seaborn depends on; seaborn itself is installed in a later step.

Exercise 2.2: create the virtual environment

Create the virtual environment called vpyenv as follows:

python -m venv --system-site-packages /proj/nobackup/r-py-jl-m/[username]/python/vpyenv

where [username] is your HPC2N username, for example python -m venv --system-site-packages /proj/nobackup/r-py-jl-m/sven/python/vpyenv.
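The `--system-site-packages` flag decides whether the new environment can also see the packages of the Python it was created from. A minimal sketch, assuming a throwaway environment in a temporary folder and hypothetical function names, shows where this choice is recorded: the `pyvenv.cfg` file at the top of the environment folder.

```python
import tempfile
import venv
from pathlib import Path


def read_pyvenv_cfg(env_dir):
    """Parse the 'key = value' lines of an environment's pyvenv.cfg file."""
    settings = {}
    for line in (Path(env_dir) / "pyvenv.cfg").read_text().splitlines():
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def system_site_setting(system_site_packages):
    """Create a throwaway environment and report its include-system-site-packages value."""
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "demo_env"
        # with_pip=False keeps the sketch fast; the course environment does include pip
        venv.create(env_dir, system_site_packages=system_site_packages, with_pip=False)
        return read_pyvenv_cfg(env_dir)["include-system-site-packages"]


print(system_site_setting(True))   # environment created with the flag
print(system_site_setting(False))  # fully isolated environment
```

Reading `pyvenv.cfg` of an existing environment is a quick way to find out how it was created.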

Exercise 2.3: activate the virtual environment

Activate the virtual environment called vpyenv as follows:

source /proj/nobackup/r-py-jl-m/[username]/python/vpyenv/bin/activate

where [username] is your HPC2N username, for example source /proj/nobackup/r-py-jl-m/sven/python/vpyenv/bin/activate.

This virtual environment will be used in later sessions too.
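To confirm that the activation worked, one can ask Python itself. The sketch below, with a hypothetical function name, relies on the fact that inside a virtual environment `sys.prefix` points to the environment folder while `sys.base_prefix` points to the Python installation it was created from. It also prints the `VIRTUAL_ENV` variable, which the activate script sets.

```python
import os
import sys


def in_virtual_environment():
    """True when this interpreter runs inside a virtual environment."""
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Inside a virtual environment: {in_virtual_environment()}")
# VIRTUAL_ENV is unset when no environment has been activated in this shell
print(f"VIRTUAL_ENV: {os.environ.get('VIRTUAL_ENV', '(not set)')}")
```

Running this before and after the source command should show the interpreter path change to one inside the vpyenv folder.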

Exercise 2.4: install Python packages

Install the seaborn package:

pip install --no-cache-dir --no-build-isolation seaborn
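When an installation has to be scripted, a common pattern is to call pip through the interpreter that is currently running, so the package ends up in that interpreter's environment. The sketch below only builds and prints such a command, using the same options as above; the function names are hypothetical, and `pip_install` is not called here because it needs network access.

```python
import subprocess
import sys


def pip_install_command(packages):
    """Build a pip command bound to the interpreter that runs this script."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-cache-dir", "--no-build-isolation", *packages,
    ]


def pip_install(packages):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(packages), check=True)


print(" ".join(pip_install_command(["seaborn"])))
```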

Exercise 2.5: check if the Python packages are installed

pip list

To see only the Python packages installed in the virtual environment itself (i.e. not loaded from a module), use:

pip list --local
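The same check can be done from within Python, which is handy at the top of a script that depends on a package. This is a minimal sketch with a hypothetical function name; it uses the standard-library `importlib.metadata` module to look up the version of a package and returns None when the package is not installed.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None when it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("seaborn", "numpy", "matplotlib"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```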

Exercise 2.6: deactivate the virtual environment

deactivate

Well done, you’ve just created a virtual environment called vpyenv that has seaborn installed!

More exercises

More exercises can be found here.

Conclusion

Keypoints

You have:

  • heard that virtual environments allow for independent and isolated sets of Python packages

  • heard that there are multiple virtual environment managers:
    • UPPMAX: Conda and venv

    • HPC2N: venv

    • LUNARC: Conda and venv

  • created, activated, used and deactivated virtual environments

You may:

  • consider creating a virtual environment per project, for better reproducibility