Load and run Python and use packages
- At UPPMAX, HPC2N, LUNARC, and NSC (and most other Swedish HPC centres) we call the applications available via the module system modules.
Objectives
Show how to load Python
Show how to run Python scripts and start the Python command line
Short cheat sheet
See which modules exist:
module spider
or ml spider
Find module versions for a particular software:
module spider <software>
Modules depending only on what is currently loaded:
module avail
or ml av
See which modules are currently loaded:
module list
or ml
Load a module:
module load <module>/<version>
or ml <module>/<version>
Unload a module:
module unload <module>/<version>
or ml -<module>/<version>
More information about a module:
module show <module>/<version>
or ml show <module>/<version>
Unload all modules except the ‘sticky’ modules:
module purge
or ml purge
Warning
Note that the module systems at UPPMAX, HPC2N, LUNARC, and NSC are slightly different:
- At UPPMAX, all modules not directly related to bioinformatics are shown by ml avail.
- At NSC, all modules are shown by ml avail.
- At HPC2N and LUNARC, modules are hidden until one has loaded a prerequisite like the compiler GCC.
For reproducibility reasons, you should always load a specific version of a module instead of just the default version.
Many modules have prerequisite modules which need to be loaded first (at HPC2N/LUNARC/NSC this is also the case for the Python modules). When doing
module spider <module>/<version>
you will get a list of which other modules need to be loaded first.
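For reproducibility it can also help to record, from inside the Python run itself, which modules were loaded. The following is a minimal sketch, assuming the module system exports the colon-separated environment variable LOADEDMODULES (Lmod and Environment Modules normally do, but check on your cluster). The function name loaded_modules is only illustrative.

```python
import os


def loaded_modules(environ=None):
    """Return the currently loaded modules as a list of 'name/version' strings.

    Assumption: the module system exports LOADEDMODULES as a
    colon-separated list. If the variable is missing, return [].
    """
    if environ is None:
        environ = os.environ
    raw = environ.get("LOADEDMODULES", "")
    # Skip empty entries, e.g. when the variable is unset or empty.
    return [entry for entry in raw.split(":") if entry]


if __name__ == "__main__":
    # Print one loaded module per line, e.g. to save in a log file.
    for name in loaded_modules():
        print(name)
```

Writing this list to the job's log next to the results gives a record of the exact module versions that produced them.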
Check for Python versions
Tip
Type along!
UPPMAX:
Check all available Python versions with:
$ module avail python
NOTE that python is written in lower case!

HPC2N:
Check all available Python versions with:
$ module spider Python
To see how to load a specific version of Python, including the prerequisites, do
$ module spider Python/<version>
Example for Python 3.11.3
$ module spider Python/3.11.3

LUNARC:
Check all available Python versions with:
$ module spider Python
To see how to load a specific version of Python, including the prerequisites, do
$ module spider Python/<version>
Example for Python 3.11.5
$ module spider Python/3.11.5

NSC:
Check all available Python versions with:
$ module spider Python
To see how to load a specific version of Python, including the prerequisites, do
$ module spider Python/<version>
Example for Python 3.10.4
$ module spider Python/3.10.4
Output at UPPMAX as of May 14, 2024
----------------------------------- /sw/mf/rackham/applications -----------------------------------
   python_GIS_packages/3.10.8      python_ML_packages/3.9.5-gpu         wrf-python/1.3.1
   python_ML_packages/3.9.5-cpu    python_ML_packages/3.11.8-cpu (D)

------------------------------------ /sw/mf/rackham/compilers -------------------------------------
   python/2.7.6     python/3.4.3    python/3.9.5         python3/3.6.8     python3/3.11.8
   python/2.7.9     python/3.5.0    python/3.10.8        python3/3.7.2     python3/3.12.1 (D)
   python/2.7.11    python/3.6.0    python/3.11.4        python3/3.8.7
   python/2.7.15    python/3.6.8    python/3.11.8        python3/3.9.5
   python/3.3       python/3.7.2    python/3.12.1 (D)    python3/3.10.8
   python/3.3.1     python/3.8.7    python3/3.6.0        python3/3.11.4

  Where:
   D:  Default Module

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
Output at HPC2N as of May 14, 2024
b-an01 [~]$ module spider Python
----------------------------------------------------------------------------
  Python:
----------------------------------------------------------------------------
    Description:
      Python is a programming language that lets you work more quickly and integrate your systems more effectively.

     Versions:
        Python/2.7.15
        Python/2.7.16
        Python/2.7.18-bare
        Python/2.7.18
        Python/3.7.2
        Python/3.7.4
        Python/3.8.2
        Python/3.8.6
        Python/3.9.5-bare
        Python/3.9.5
        Python/3.9.6-bare
        Python/3.9.6
        Python/3.10.4-bare
        Python/3.10.4
        Python/3.10.8-bare
        Python/3.10.8
        Python/3.11.3
        Python/3.11.5
     Other possible modules matches:
        Biopython  Boost.Python  GitPython  IPython  flatbuffers-python  ...
----------------------------------------------------------------------------
  To find other possible module matches execute:

      $ module -r spider '.*Python.*'
----------------------------------------------------------------------------
  For detailed information about a specific "Python" package (including how to load the modules) use the module's full name.
  Note that names that have a trailing (E) are extensions provided by other modules.
  For example:

     $ module spider Python/3.9.5
----------------------------------------------------------------------------
Output at LUNARC as of Nov 5, 2024
$ module spider Python
--------------------------------------------------------------------------------------------------------
  Python:
--------------------------------------------------------------------------------------------------------
    Description:
      Python is a programming language that lets you work more quickly and integrate your systems more effectively.

     Versions:
        Python/2.7.18-bare
        Python/2.7.18
        Python/3.8.6
        Python/3.9.5-bare
        Python/3.9.5
        Python/3.9.6-bare
        Python/3.9.6
        Python/3.10.4-bare
        Python/3.10.4
        Python/3.10.8-bare
        Python/3.10.8
        Python/3.11.3
        Python/3.11.5
        Python/3.12.3
     Other possible modules matches:
        Biopython  GitPython  IPython  Python-bundle  Python-bundle-PyPI  bx-python  flatbuffers-python  ...
--------------------------------------------------------------------------------------------------------
  To find other possible module matches execute:

      $ module -r spider '.*Python.*'
--------------------------------------------------------------------------------------------------------
  For detailed information about a specific "Python" package (including how to load the modules) use the module's full name.
  Note that names that have a trailing (E) are extensions provided by other modules.
  For example:

     $ module spider Python/3.12.3
--------------------------------------------------------------------------------------------------------
Output at NSC (Tetralith) as of Nov 20, 2024
$ module spider Python
####################################################################################################################################
# NOTE: At NSC the output of 'module spider' is generally not helpful as all relevant software modules are shown by 'module avail' #
# Some HPC centers hide software until the necessary dependencies have been loaded. NSC does not do that.                          #
####################################################################################################################################
----------------------------------------------------------------------------
  Python:
----------------------------------------------------------------------------
     Versions:
        Python/recommendation
        Python/2.7.18-bare-hpc1-gcc-2022a-eb
        Python/2.7.18-bare
        Python/3.10.4-bare-hpc1-gcc-2022a-eb
        Python/3.10.4-bare
        Python/3.10.4-env-hpc1-gcc-2022a-eb
        Python/3.10.4-env-hpc2-gcc-2022a-eb
        Python/3.10.4
        Python/3.10.8-bare
        Python/3.10.8
        Python/3.11.3
        Python/3.11.5
     Other possible modules matches:
        IPython  netcdf4-python
----------------------------------------------------------------------------
  To find other possible module matches execute:

      $ module -r spider '.*Python.*'
----------------------------------------------------------------------------
  For detailed information about a specific "Python" package (including how to load the modules) use the module's full name.
  Note that names that have a trailing (E) are extensions provided by other modules.
  For example:

     $ module spider Python/3.11.5
----------------------------------------------------------------------------
Note
Unless otherwise stated, we recommend using Python 3.11.x in this course at HPC2N, UPPMAX, and LUNARC. We will use Python 3.10.4 at NSC, unless otherwise stated.
Load a Python module
For reproducibility, we recommend ALWAYS loading a specific module instead of using the default version!
Tip
Type along!
Go back and check which Python modules were available.

UPPMAX:
To load version 3.11.8, do:
$ module load python/3.11.8
Note: Lowercase p.
For short, you can also use:
$ ml python/3.11.8

HPC2N:
To load Python version 3.11.3, do:
$ module load GCC/12.3.0 Python/3.11.3
Note: Uppercase P.
For short, you can also use:
$ ml GCC/12.3.0 Python/3.11.3

LUNARC:
To load Python version 3.11.5, do:
$ module load GCC/13.2.0 Python/3.11.5
Note: Uppercase P.
For short, you can also use:
$ ml GCC/13.2.0 Python/3.11.5

NSC:
To load Python version 3.10.4, do:
$ module load buildtool-easybuild/4.8.0-hpce082752a2 GCC/11.3.0 Python/3.10.4
Note: Uppercase P.
For short, you can also use:
$ ml buildtool-easybuild/4.8.0-hpce082752a2 GCC/11.3.0 Python/3.10.4
Warning
UPPMAX: Don’t use system-installed python (2.7.5)
UPPMAX: Don’t use system-installed python3 (3.6.8)
HPC2N: Don’t use system-installed python (2.7.18)
HPC2N: Don’t use system-installed python3 (3.8.10)
LUNARC: Don’t use system-installed python/python3 (3.9.18)
NSC: Don’t use system-installed python/python3 (3.9.18)
ALWAYS use a Python module
Why are there both Python/2.X.Y and Python/3.Z.W modules?
Some existing software might use Python2 and some will use Python3.
Some of the Python packages have both Python2 and Python3 versions.
Check what your software as well as the installed modules need when you pick!
UPPMAX: Why are there both python/3.X.Y and python3/3.X.Y modules?
Sometimes existing software might use python2 and there’s nothing you can do about that.
In pipelines and other toolchains the different tools may together require both python2 and python3.
Here’s how you handle that situation:
You can run two python modules at the same time if ONE of the modules is python/2.X.Y and the other module is python3/3.X.Y (not python/3.X.Y).
LUNARC: Are python and python3 equivalent, or does the former load Python/2.X.Y?
The answer depends on which module is loaded. If Python/3.X.Y is loaded, then python is just an alias for python3 and it will start the same command line. However, if Python/2.7.X is loaded, then python will start the Python/2.7.X command line while python3 will start the system version (3.9.18). If you load Python/2.7.X and then try to load Python/3.X.Y as well, or vice-versa, the most recently loaded Python version will replace anything loaded prior, and all dependencies will be upgraded or downgraded to match. Only the system’s Python/3.X.Y version can be run at the same time as a version of Python/2.7.X.
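Whichever site you are on, a quick way to see which interpreter the python or python3 command actually started is to ask Python itself. The sketch below uses only the standard library; the function name interpreter_summary is just an illustrative choice.

```python
import sys


def interpreter_summary():
    """Return (path to the running interpreter, 'major.minor.micro')."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return sys.executable, version


if __name__ == "__main__":
    path, version = interpreter_summary()
    # A path under the module tree suggests a module-provided Python;
    # a path like /usr/bin/python3 suggests the system installation.
    print("Running Python {} from {}".format(version, path))
```

If the printed path points to the system installation rather than to a module, the Python module was probably not loaded in the current shell.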
Run
Run Python script
Hint
There are many ways to edit your scripts.
If you are rather new:
Graphical:
$ gedit <script> &
(& lets you keep using the terminal while the editor window is open.) Requires ThinLinc or ssh -X.
Terminal:
$ nano <script>
Otherwise you would know what to do!
- ⚠️ The teachers may use their usual editor, like vi/vim. If you get stuck in vim, press <esc> and then type :q!
Type-Along
Let’s make a script with the name example.py
$ nano example.py
Insert the following text
# This program prints Hello, world!
print('Hello, world!')
Save and exit. In nano: <ctrl>+O, <ctrl>+X
You can run a python script in the shell like this:
$ python example.py
# or
$ python3 example.py
Warning
ONLY run jobs that are short and/or do not use a lot of resources from the command line.
Otherwise use the batch system (see the batch session)
Run an interactive Python shell
You can start a simple python terminal by:
$ python
Example
>>> a=3
>>> b=7
>>> c=a+b
>>> c
10
Exit Python with <Ctrl-D>, quit() or exit() at the Python prompt
>>> <Ctrl-D>
>>> quit()
>>> exit()
For a more interactive experience you can run IPython.
Tip
Type along!
UPPMAX:
NOTE: remember to load a python module first. Then start IPython from the terminal
$ ipython
or
$ ipython3
UPPMAX also has jupyter-notebook installed and available from the loaded Python module. Start with
$ jupyter-notebook
You can use your own favorite browser by adding --no-browser and opening the URL given in the output.
From python/3.10.8 and forward, jupyterlab is also available.

HPC2N:
NOTE: remember to load an IPython module first. You can see possible modules with
$ module spider IPython
And load one of them (here 8.14.0) with
$ ml GCC/12.3.0 IPython/8.14.0
Then start IPython with (lowercase):
$ ipython
HPC2N also has JupyterLab installed. It is available as a module, but the process of using it is somewhat involved. We will cover it more under the session on Interactive work on the compute nodes (https://uppmax.github.io/HPC-python/interactive.html). Otherwise, see this tutorial:

LUNARC:
NOTE: remember to load an IPython module first. You can see possible modules with
$ module spider IPython
And load one of them (here 8.14.0) with
$ ml GCC/12.3.0 IPython/8.14.0
Then start IPython with (lowercase):
$ ipython
LUNARC also has JupyterLab, JupyterNotebook, and JupyterHub installed.

NSC:
NOTE: remember to load an IPython module first. You can see possible modules with
$ module spider IPython
And load one of them (here 8.5.0) with
$ ml buildtool-easybuild/4.8.0-hpce082752a2 GCC/11.3.0 IPython/8.5.0
Then start IPython with (lowercase):
$ ipython
Exit IPython with <Ctrl-D>, quit() or exit() at the IPython prompt
In [2]: <Ctrl-D>
In [12]: quit()
In [17]: exit()
Packages/Python modules
Python modules AKA Python packages
Python packages broaden the use of python to almost infinity!
Instead of writing code yourself there may be others that have done the same!
Many scientific tools are distributed as python packages, making it possible to run a script in the prompt and there define files to be analysed and arguments defining exactly what to do.
A nice introduction to packages can be found here: Python for scientific computing
Questions
How do I find which packages and versions are available?
What to do if I need other packages?
Are there differences between HPC2N, LUNARC, UPPMAX, and NSC?
Objectives
Show how to check for Python packages
Show how to install your own packages on the different clusters
Check current available packages
General for all four centers
Some python packages are working as stand-alone tools, for instance in bioinformatics. The tool may be already installed as a module. Check if it is there by:
$ module spider <tool-name or tool-name part>
Using module spider lets you search regardless of upper- or lowercase characters and regardless of already loaded modules (like GCC on HPC2N and bioinfo-tools on UPPMAX).
At UPPMAX, check the pre-installed packages of a specific python module:
$ module help python/<version>
At HPC2N, LUNARC, and NSC, a way to find Python packages whose exact names you are unsure of is to do
$ module -r spider '.*Python.*'
or
$ module -r spider '.*python.*'
Be aware that the output will not only contain Python packages; some entries are just programs that are compiled with Python, so you need to check the list carefully.
Check the pre-installed packages of a loaded python module, in shell:
$ pip list
To see which Python packages you yourself have installed, you can use pip list --user while the environment you have installed the packages in is active.
You can also test from within python to make sure that the package is not already installed:
>>> import <package>
Does it work? Then it is there!
Otherwise, you can install it with either pip or conda.
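The import test can also be done for several packages at once, without actually importing them. This is a minimal sketch using importlib.util.find_spec from the standard library; the helper name is_installed and the package names in the loop are only examples.

```python
import importlib.util


def is_installed(package):
    """Return True if the top-level `package` can be found for import.

    find_spec only locates the package; it does not run its code.
    Use top-level names here (e.g. 'scipy', not 'scipy.stats').
    """
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Example names only; replace with the packages you need.
    for name in ["numpy", "scipy", "json"]:
        status = "found" if is_installed(name) else "missing"
        print(name, status)
```

Run it with the same modules loaded as you plan to use in your job, since the result depends on the active environment.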
Check the path to the package you are using.
In a Python session, type:
import [a_module]
print([a_module].__file__)
The print-out tells you the path to the module's file (.py or .pyc), which should give you a hint about where the package is installed.
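The path check can be combined with a version lookup. The sketch below assumes Python 3.8 or newer (for importlib.metadata); the helper name describe is hypothetical. Note that the name used for the version lookup is the distribution name, which for some packages differs from the import name, so a missing version does not always mean a missing package.

```python
import importlib
from importlib import metadata


def describe(package):
    """Return (file path, installed version) for an importable package.

    The version is None when no distribution metadata with that name is
    found, which is normally the case for standard-library modules.
    """
    module = importlib.import_module(package)
    # Built-in modules have no __file__, so fall back to None.
    path = getattr(module, "__file__", None)
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    return path, version


if __name__ == "__main__":
    # 'json' is only an example; try the package you are interested in.
    print(describe("json"))
```

A path inside your home directory suggests a package you installed yourself, while a path under the software tree of the cluster suggests one provided by a module.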
Check packages (5 min)
See if the following packages are installed. Use python version 3.11.8 on Rackham, 3.11.3 on Kebnekaise, 3.11.5 on Cosmos, and 3.10.4 on Tetralith (remember: the Python module on Kebnekaise/Cosmos/Tetralith has prerequisite(s)).
numpy
pandas
mpi4py
distributed
multiprocessing
time
dask
Solution
- On Rackham, the ordinary python/3.11.8 module has the following already installed:
numpy ✅
pandas ✅
mpi4py ❌
distributed ❌
multiprocessing ✅ (standard library)
time ✅ (standard library)
dask ✅
- On Kebnekaise, the ordinary Python/3.11.3 module has the following already installed:
numpy ❌
pandas ❌
mpi4py ❌
distributed ❌
multiprocessing ✅ (standard library)
time ✅ (standard library)
dask ❌
- On Cosmos, the ordinary Python/3.11.5 module has the following already installed:
numpy ❌
pandas ❌
mpi4py ❌
distributed ❌
multiprocessing ✅ (standard library)
time ✅ (standard library)
dask ❌
- On Tetralith, the ordinary Python/3.10.4 module has the following already installed:
numpy ❌
pandas ❌
mpi4py ❌
distributed ❌
multiprocessing ✅ (standard library)
time ✅ (standard library)
dask ❌
See next session how to find more pre-installed packages!
NOTE: at HPC2N, LUNARC, and NSC, the available Python packages need to be loaded as modules/module-bundles before using! See a list of some of them below, under the HPC2N/LUNARC/NSC tab, or find more as mentioned above, using module -r spider ...
A selection of the Python packages and libraries installed on UPPMAX, HPC2N, LUNARC, and NSC is given in the extra reading: UPPMAX clusters, Kebnekaise cluster, and LUNARC cluster.
The python application at UPPMAX comes with several preinstalled packages.
You can check them here: UPPMAX packages.
In addition there are packages available from the module system as python tools/packages.
Note that bioinformatics-related tools can be reached only after loading bioinfo-tools.
Two modules contain topic-specific packages. These are:
Machine learning: python_ML_packages (cpu and gpu versions, based on python/3.9.5 and python/3.11.8)
GIS: python_GIS_packages (cpu version based on python/3.10.8)
The python application at HPC2N comes with several preinstalled packages - check first before installing yourself!
HPC2N has both Python 2.7.x and Python 3.x installed.
We will be using Python 3.x in this course. For this course, the recommended version of Python to use on Kebnekaise is 3.11.3.
NOTE: HPC2N do NOT recommend (and do not support) using Anaconda/Conda on our systems. You can read more about this here: Anaconda.
This is a selection of the packages and libraries installed at HPC2N. These are all installed as modules and need to be loaded before use.
ASE
Keras
PyTorch
SciPy-bundle (Bottleneck, deap, mpi4py, mpmath, numexpr, numpy, pandas, scipy - some of the versions have more)
TensorFlow
Theano
matplotlib
scikit-learn
scikit-image
IPython
Cython
Flask
JupyterLab
Python-bundle-PyPI (Bundle of Python packages from PyPI)
The python application at LUNARC comes with several preinstalled packages - check first before installing yourself!
LUNARC has both Python 2.7.x and Python 3.x installed.
We will be using Python 3.x in this course. For this course, the recommended version of Python to use on Cosmos is 3.11.5.
This is a selection of the packages and libraries installed at LUNARC. These are all installed as modules and need to be loaded before use.
PyTorch
SciPy-bundle (Bottleneck, deap, mpi4py, mpmath, numexpr, numpy, pandas, scipy - some of the versions have more)
TensorFlow
matplotlib
scikit-learn
scikit-image
IPython
Cython
Biopython
JupyterLab
Python-bundle (NumPy, SciPy, Matplotlib, JupyterLab, MPI4PY, …)
The python application at NSC (Tetralith) comes with very few preinstalled packages, but many can be found in extra modules - check first before installing yourself!
NSC has both Python 2.7.x and Python 3.x installed.
We will be using Python 3.x in this course. For this course, the recommended version of Python to use on Tetralith is 3.10.4.
This is a selection of the packages and libraries installed at NSC (Tetralith). These are all installed as modules and need to be loaded before use.
SciPy-bundle (Bottleneck, deap, mpi4py, mpmath, numexpr, numpy, pandas, scipy - some of the versions have more)
matplotlib
IPython
Demo/Type-along
This is an exercise that combines loading, running, and using site-installed packages. Later, during the ML session, we will look at running the same exercise, but as a batch job. There is also a follow-up exercise with an extended version of the script, if you want to try to run that as well (see further down on the page).
Note
You need the data file scottish_hills.csv, which can be found in the directory Exercises/examples/programs. If you have cloned the git repo for the course, or copied the tarball, you should have this directory. The easiest thing to do is just change to that directory and run the exercise there.
Since the exercise opens a plot, you need to log in with ThinLinc (or otherwise have an X11 server running on your system and log in with ssh -X ...).
The exercise is modified from an example found on https://ourcodingclub.github.io/tutorials/pandas-python-intro/.
Warning
This is not relevant at UPPMAX, only if you are using HPC2N, LUNARC, or NSC!
There, you also need to load Tkinter.
For HPC2N:
ml GCC/12.3.0 Python/3.11.3 SciPy-bundle/2023.07 matplotlib/3.7.2 Tkinter/3.11.3
For LUNARC:
ml GCC/13.2.0 Python/3.11.5 SciPy-bundle/2023.11 matplotlib/3.8.2 Tkinter/3.11.5
For NSC (Tetralith):
ml buildtool-easybuild/4.8.0-hpce082752a2 GCC/11.3.0 OpenMPI/4.1.4 Python/3.10.4 SciPy-bundle/2022.05 matplotlib/3.5.2 Tkinter/3.10.4
In addition, you need to add the following two lines to the top of your python script/run them first in Python, for HPC2N, LUNARC, and NSC:
import matplotlib
matplotlib.use('TkAgg')
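If the same script should also run where no display is available (for instance in a batch job), the backend name can be chosen at run time. This is only a sketch under one assumption: that a non-empty DISPLAY environment variable means an X11 display is usable. TkAgg and Agg are real matplotlib backend names; the helper pick_backend is hypothetical and not part of the course scripts.

```python
import os


def pick_backend(environ=None):
    """Return 'TkAgg' when an X11 display seems available, else 'Agg'.

    Assumption: a non-empty DISPLAY variable means plots can be shown.
    'Agg' is matplotlib's non-interactive backend that only writes files.
    """
    if environ is None:
        environ = os.environ
    return "TkAgg" if environ.get("DISPLAY") else "Agg"


if __name__ == "__main__":
    # Shows which backend name would be chosen in the current session.
    print(pick_backend())
```

In a plotting script you would pass the result to matplotlib.use() right after import matplotlib, and save the figure with plt.savefig() instead of calling plt.show() when the result is Agg.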
Python example with packages pandas and matplotlib
We are using Python version 3.11.x, except on Tetralith where we use Python/3.10.4. To access the packages pandas and matplotlib, you may need to load other modules, depending on the site where you are working.
On Rackham (UPPMAX) you only need to load the python module, as the relevant packages are included (as long as you are not using GPUs, but that is covered later in the course). Thus, you just do:
$ ml python/3.11.8
On Kebnekaise you also need to load SciPy-bundle and matplotlib (and their prerequisites). These versions will work well together (and with Tkinter/3.11.3):
$ ml GCC/12.3.0 Python/3.11.3 SciPy-bundle/2023.07 matplotlib/3.7.2
On Cosmos you also need to load SciPy-bundle and matplotlib (and their prerequisites). These versions will work well together (and with Tkinter/3.11.5):
$ ml GCC/13.2.0 Python/3.11.5 SciPy-bundle/2023.11 matplotlib/3.8.2
On Tetralith you also need to load SciPy-bundle and matplotlib (and their prerequisites). In this example we will use Python 3.10.4, as that is the one that has compatible versions of these and a compatible Tkinter (3.10.4):
$ ml buildtool-easybuild/4.8.0-hpce082752a2 GCC/11.3.0 OpenMPI/4.1.4 matplotlib/3.5.2 SciPy-bundle/2022.05 Tkinter/3.10.4
From inside Python/interactive (if you are on Kebnekaise/Cosmos/Tetralith, mind the warning above about loading a compatible Tkinter and adding the two lines importing matplotlib and setting TkAgg at the top):
Start python and run these lines:
import pandas as pd
import matplotlib.pyplot as plt
dataframe = pd.read_csv("scottish_hills.csv")
x = dataframe.Height
y = dataframe.Latitude
plt.scatter(x, y)
plt.show()
If you change the last line to
plt.savefig("myplot.png")
then you will instead get a file myplot.png containing the plot. This is what you would do if you were running a python script in a batch job.
- On UPPMAX, LUNARC, and NSC you can view png files with the program eog
Test:
eog myplot.png &
- On HPC2N you can view png files with the program eom
Test:
eom myplot.png &
As a Python script (if you are on Kebnekaise/Cosmos/Tetralith, mind the warning above about Tkinter):
Copy and save this script as a file (or just run the file pandas_matplotlib-<system>.py that is located in the <path-to>/Exercises/examples/programs directory you got from the repo or copied), where <system> is either rackham, kebnekaise, cosmos, or tetralith.
rackham:
import pandas as pd
import matplotlib.pyplot as plt
dataframe = pd.read_csv("scottish_hills.csv")
x = dataframe.Height
y = dataframe.Latitude
plt.scatter(x, y)
plt.show()
kebnekaise, cosmos, and tetralith:
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('TkAgg')
dataframe = pd.read_csv("scottish_hills.csv")
x = dataframe.Height
y = dataframe.Latitude
plt.scatter(x, y)
plt.show()
If you have time, you can also try to run these extended versions, which also require the scipy package (included with python at UPPMAX and with the same modules loaded as for pandas for HPC2N/LUNARC/NSC):
Exercises (C. 10 min)
Python example that requires the pandas, matplotlib, and scipy packages.
You can either save the scripts or run them line by line inside Python. The scripts are also available in the directory <path-to>/Exercises/examples/programs, as pandas_matplotlib-linreg.py and pandas_matplotlib-linreg-pretty.py.
NOTE that there are separate versions for rackham, kebnekaise, cosmos, and tetralith, and that for kebnekaise, cosmos, and tetralith you again need to add the same lines regarding TkAgg as mentioned under the warning before the previous exercise.
Remember that you also need the data file scottish_hills.csv located in the above directory.
Examples are from https://ourcodingclub.github.io/tutorials/pandas-python-intro/
pandas_matplotlib-linreg.py
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress
dataframe = pd.read_csv("scottish_hills.csv")
x = dataframe.Height
y = dataframe.Latitude
stats = linregress(x, y)
m = stats.slope
b = stats.intercept
plt.scatter(x, y)
plt.plot(x, m * x + b, color="red") # I've added a color argument here
plt.show()
pandas_matplotlib-linreg-pretty.py
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress
dataframe = pd.read_csv("scottish_hills.csv")
x = dataframe.Height
y = dataframe.Latitude
stats = linregress(x, y)
m = stats.slope
b = stats.intercept
# Change the default figure size
plt.figure(figsize=(10,10))
# Change the default marker for the scatter from circles to x's
plt.scatter(x, y, marker='x')
# Set the linewidth on the regression line to 3px
plt.plot(x, m * x + b, color="red", linewidth=3)
# Add x and y labels, and set their font size
plt.xlabel("Height (m)", fontsize=20)
plt.ylabel("Latitude", fontsize=20)
# Set the font size of the number labels on the axes
plt.xticks(fontsize=18)
plt.yticks(fontsize=18)
plt.show()
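To see what the slope and intercept from linregress represent, here is a plain-Python sketch of an ordinary least-squares line fit. It is not how scipy implements linregress, only the textbook formulas written out; the function name fit_line and the small data set are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
    intercept = mean_y - slope * mean_x
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Made-up points that lie exactly on the line y = 2x.
    m, b = fit_line([1, 2, 3], [2, 4, 6])
    print(m, b)
```

In the scripts above, m * x + b evaluates this fitted line at every height in the data, which is what the red line in the plot shows.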
Keypoints
Before you can run Python scripts or work in a Python shell, first load a python module and any prerequisites
Start a Python shell session either with python or ipython
Run scripts with python3 <script.py>
You can check for packages
- from the Python shell with the import command
- from the BASH shell with the pip list command at all four centers
- with ml help python/<version> at UPPMAX
Installation of Python packages can be done either with PyPI or Conda
You install your own packages with the pip install command (this is the recommended way on HPC2N)
At UPPMAX, LUNARC, and NSC Conda is also available (see the Conda section)