Summary
Keypoints
- Load and run Python and use pre-installed packages
Use Python from the module system.
Start a Python shell session with either python or ipython.
Run scripts with python3 <script.py>.
- Check for preinstalled packages
From the Python shell, with the import command.
From the Bash shell, with the pip list command (at both centers).
At UPPMAX: ml help python/3.11.8
At HPC2N: module -r spider '.*Python.*'
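The import check above can also be scripted. The following is a minimal sketch, assuming you only want to know which of a handful of top-level packages the currently loaded Python can import. The helper name check_packages and the listed package names are illustrative and not part of the course material.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each top-level package name to True if it can be imported."""
    # find_spec looks the package up on the import path without importing it.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Hypothetical package list; replace it with what your project needs.
    for name, available in check_packages(["numpy", "pandas", "matplotlib"]).items():
        print(f"{name}: {'available' if available else 'NOT found'}")
```

Run it after loading the Python module, so that the result reflects the packages that module provides.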
- Install packages and use isolated environments
With a virtual environment you can tailor an environment with specific versions of Python and packages, without interfering with other installed Python versions and packages.
Create one for each project, for reproducibility.
- There are different tools to create virtual environments.
With virtualenv and venv, you install packages with pip.
The flag --system-site-packages includes the preinstalled packages as well.
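Virtual environments can also be created from Python itself with the standard-library venv module. The following is a minimal sketch, assuming a hypothetical helper create_project_env and a throwaway directory; in practice you would choose a permanent location inside your project.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(env_dir, use_system_packages=True):
    """Create a virtual environment at env_dir and return its path."""
    env_dir = Path(env_dir)
    # system_site_packages=True corresponds to the --system-site-packages flag.
    # with_pip=False keeps this sketch quick; use with_pip=True to get pip
    # installed inside the environment.
    venv.create(env_dir, system_site_packages=use_system_packages, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # Hypothetical location: a temporary directory that is removed afterwards.
    with tempfile.TemporaryDirectory() as tmp:
        env = create_project_env(Path(tmp) / "myproject-env")
        # pyvenv.cfg records whether the system site packages are included.
        print((env / "pyvenv.cfg").read_text())
```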
- (At UPPMAX Conda is also available)
- Conda installs packages, but also bigger toolkits.
Conda creates isolated environments as well, but requires that you install all the packages you need.
Rackham: pip first, Conda as the second choice.
Bianca: Conda first; as the second choice, transfer via the wharf and install with pip or Conda.
- Batch mode
The SLURM scheduler handles allocations to the calculation nodes
Batch jobs run without interaction with the user.
A batch script consists of a part with SLURM parameters describing the allocation and a second part describing the actual work within the job, for instance one or several Python scripts.
Remember to include possible input arguments to the Python script in the batch script.
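Arguments given on the python3 line of the batch script reach the Python script through the command line. The following is a minimal sketch of the Python side, assuming a hypothetical script with two illustrative options, --input-file and --iterations; neither name comes from the course material.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments that the batch script passes on the command line."""
    parser = argparse.ArgumentParser(description="Example script started from a batch job")
    # Both options are hypothetical and only illustrate the mechanism.
    parser.add_argument("--input-file", default="data.csv",
                        help="path to the input data")
    parser.add_argument("--iterations", type=int, default=10,
                        help="number of iterations to run")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"Processing {args.input_file} for {args.iterations} iterations")
```

In the work part of the batch script, the corresponding line would then look like python3 <script.py> --input-file data.csv --iterations 5.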
- Interactive work on calculation nodes
- Start an interactive session on a calculation node by a SLURM allocation (similar flags)
At HPC2N: salloc …
At UPPMAX: interactive …
Follow the same procedure as usual by loading the Python module and possible prerequisites.
- Parallel
You allocate cores and nodes via SLURM, either in interactive or batch mode.
In Python, you can use threads, distributed and MPI parallelization, and Dask.
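A Python script can size its worker pool from the allocation instead of hard-coding a core count. The following is a minimal sketch using threads, assuming the job was submitted with --cpus-per-task so that SLURM sets SLURM_CPUS_PER_TASK. The helper names allocated_cores and parallel_squares are illustrative.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def allocated_cores():
    """Return the number of cores SLURM gave this task, or 1 if it is not set."""
    # SLURM_CPUS_PER_TASK is only set when the job requested --cpus-per-task.
    return int(os.environ.get("SLURM_CPUS_PER_TASK", "1"))


def square(x):
    """Stand-in for the real work done on each item."""
    return x * x


def parallel_squares(values, workers=None):
    """Apply square() to each value using a pool of worker threads."""
    workers = workers or allocated_cores()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the same order as the inputs.
        return list(pool.map(square, values))


if __name__ == "__main__":
    print(f"Using {allocated_cores()} worker thread(s)")
    print(parallel_squares(range(8)))
```

In standard CPython, threads mainly help when the work releases the global interpreter lock (I/O, many NumPy operations); for pure-Python computation, process-based or MPI parallelization is usually the better fit.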
- GPUs
You allocate GPU nodes via SLURM, either in interactive or batch mode.
In Python, the numba package is handy for GPU programming.
- Machine Learning
On all clusters you will find PyTorch, TensorFlow and scikit-learn.
- Loading them differs slightly between the clusters
UPPMAX: All tools are available from the module python_ML_packages/3.11.8
- HPC2N:
For TensorFlow
module load GCC/11.3.0 OpenMPI/4.1.4 TensorFlow/2.11.0-CUDA-11.7.0 scikit-learn/1.1.2
For the rest:
module load GCC/12.3.0 OpenMPI/4.1.5 SciPy-bundle/2023.07 matplotlib/3.7.2 PyTorch/2.1.2 scikit-learn/1.3.1
See also
Note
The Julia language is becoming increasingly popular.