Deploy your code¶
Questions
- How to make your program work for others?
Content
- We will prepare your code for use by others
- But also...
- some theory of packages
- some theory of workflows
- some theory of containers
- get some hands-on practice
Learning objectives of 'Deployment'
- learners can picture the installation needs from the user's perspective
- learners can evaluate the available tools for reproducibility and installation
- learners can prepare for different users: local computer, cluster
Instructor notes
Prerequisites are:
- ...
Lesson Plan: FIX
- Total 30 min
- Theory 20 min
- Discussions 10 min
TOC
- Overview
- Recording dependencies
- Workflows
- Containers
- Make a package
Introduction¶
- It's about Distribution!
Note
- Many projects/scripts start as something for personal use but expand to be distributed.
- Let's start from that end and be prepared.
- The following steps can also be very valuable for you in a couple of months, when you revisit your code and no longer know what it does or why you did this and that.
Attention
- Make your program or workflow work for others and yourself in the future.
- Let people understand how to use your program/tool
To make sure...¶
- Start with an empty environment
- good to do this from the beginning
- Nowadays platforms are less important, but "system files" may still differ among OS platforms and Linux distributions
- will your program require specific "system files"?
- are these typically not installed already?
- in the best of worlds, test on Windows, Mac and Linux platforms
- and with as empty an environment as possible
- What about shared services like a cluster, where users and most staff do not have write privileges ('sudo' rights) for system installations?
Discussion: Where do you run your program?
- From a terminal?
- On different computers?
- On a cluster?
Discussion: One-time usage towards distributed package
- Have others used your code?
- Did you plan for it from the beginning?
- Did you take any actions to prepare for it?
Recording dependencies¶
- Reproducibility: We can control our code but how can we control dependencies?
- 10-year challenge: Try to build/run your own code that you created 10 (or fewer) years ago. Will your code from today work in 5 years if you don’t change it?
- Dependency hell: Different programs in the same environment can have conflicting dependencies.
Conda, Anaconda, pip, Virtualenv, Pipenv, pyenv, Poetry, requirements.txt …¶
These Python-related tools try to solve the following problems:
- Defining a specific set of dependencies, possibly with well-defined versions
- Installing those dependencies mostly automatically
- Recording the versions for all dependencies
- Isolated environments
- On your computer for projects so they can use different software.
- Isolate environments on computers with many users (and allow self-installations)
- Using different Python/R versions per project
- Provide tools and services to share packages
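Whether a program is actually running inside an isolated environment can be checked from Python itself. The following is a minimal sketch, assuming an environment created with venv or virtualenv; conda environments are not detected this way. The function names are made up for illustration.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv-style environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Summarise which Python is running, e.g. for a README or a bug report."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    kind = "virtual environment" if in_virtual_environment() else "base installation"
    return f"Python {version} ({kind}) at {sys.executable}"


if __name__ == "__main__":
    print(describe_interpreter())
```

Printing this line at program start is one cheap way to record which interpreter and environment produced a result.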
Python packaging
- Make Python packages of your code.
- Possibilities for other languages can be
- C/C++
- CMake
- Conda
- Fortran
- Fortran package manager
- Julia
- Pkg.jl
Course advertisement Python for scientific computing
Containers¶
Popular container implementations:
- Docker
- Singularity (popular on high-performance computing systems)
- Apptainer (popular on high-performance computing systems, fork of Singularity)
- Docker images can be converted to Singularity/Apptainer images
- Singularity Python can convert Dockerfiles to Singularity definition files
- Containers in the extra material
Workflows¶
See also
Learn more
- Workflow management by CodeRefinery
- Snakemake by CodeRefinery
Misc¶
- Make a file executable on its own
Example Python¶
- add a header line so that the user can decide which Python to use
- especially important on a shared system where Python is not in the typical /usr/bin/python path.
- This line at the top of the main script helps:
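The header line itself is not shown here. A common choice, assumed in this sketch, is a shebang that lets `env` look up Python on the user's PATH. The function name is made up for illustration.

```python
#!/usr/bin/env python3
# The line above is an assumption (the lesson's own line is not shown here).
# It asks `env` to find the first `python3` on the user's PATH instead of
# hard-coding a location such as /usr/bin/python.
import sys


def interpreter_in_use() -> str:
    """Return the path of the Python interpreter that is running this script."""
    return sys.executable


if __name__ == "__main__":
    print(f"Running with: {interpreter_in_use()}")
```

On Linux/macOS the file also needs execute permission (`chmod +x` on the script) before it can be started directly. With an activated virtual environment, the shebang then picks up that environment's Python.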
Record our environment for other users¶
Principle using python pip in a virtual environment¶
- We can make other users aware of the dependencies for our Python project.
- One can state those specifically as a list in a README
- Or we can provide a ready-made file (for Python)
Save your requirements as a file
- You may have developed your Python program in your existing environment of Python modules. You may have installed some new packages during development without keeping track of them.
- We need to identify which Python packages a user (or you on another computer) will need to make the program work!
- Many packages are distributed with the "base" installation of Python, so it is not enough to just look at the import lines in the code.
- It may also be hard to get an overview because you have many import lines, spread over several files if you worked in a modular way.
So here are some steps:
- Start a Python virtual environment.
- you can do this outside the git repo to avoid polluting it
- python -m venv PATH/Example creates an empty virtual environment located in the PATH/Example directory
- Activate it; the activate script lives in PATH/Example/bin (Linux/macOS) or PATH/Example/Scripts (Windows)
- Note the (Example) at the beginning of the prompt!
- Note the Python version; you can tell users that this version is known to work!
$ which python #should point to the python belonging to the virtual environment
$ python -V # note this version
- You can switch to the directory where you have your code and try to run it
- It will give you errors about missing packages
- Install them with
pip install <package name>
- No need to use --user, since the package will be installed in the virtual environment only.
- Do this until your program works
- Check what is installed with pip freeze
- You will probably recognise some of the packages, but some may be more obscure and were installed automatically as dependencies.
- Save your requirements as a file that users can use to get the same dependencies as you: pip freeze > requirements.txt
- Continue
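The same kind of listing can be produced from within Python, which is handy for logging the environment at run time. This is a minimal sketch using the standard `importlib.metadata` module. It is not a replacement for pip's own listing, which by default leaves out packaging tools such as pip itself. The function name is made up for illustration.

```python
from importlib import metadata


def installed_requirements() -> list[str]:
    """Return sorted 'name==version' lines for every distribution this interpreter can see."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that lack a name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    for line in installed_requirements():
        print(line)
```

Run inside the activated virtual environment, this prints one pinned line per installed package, including the dependencies that were pulled in automatically.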
Demo with planet¶
git switch -c venv
python -m venv venv
venv/Scripts/activate  # Windows; on Linux/macOS: source venv/bin/activate
pip freeze  # should be empty
ls
cd code
ls
python planet_main.py
import numpy as np
ModuleNotFoundError: No module named 'numpy'
pip install numpy
python planet_main.py
ModuleNotFoundError: No module named 'matplotlib'
pip install matplotlib
pip freeze
pip freeze > requirements.txt
git add requirements.txt
git commit -m "add requirements.txt"
git push -u origin venv  # first push of the new branch; assumes the remote is called origin
git switch main
git merge venv
git push
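Another user would typically install from the committed file with `pip install -r requirements.txt`. Whether an environment really matches the file can then be checked with a small script. This is a sketch under a stated assumption: it handles only plain `name==version` lines, while real requirements files allow more forms. The function names are made up for illustration.

```python
from importlib import metadata


def parse_pins(text: str) -> dict[str, str]:
    """Parse simple 'name==version' lines, ignoring blank lines and comments."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def check_pins(pins: dict[str, str]) -> dict[str, str]:
    """Return a message for every pinned package that is missing or has another version."""
    problems = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = f"not installed (wanted {wanted})"
            continue
        if found != wanted:
            problems[name] = f"found {found}, wanted {wanted}"
    return problems


if __name__ == "__main__":
    with open("requirements.txt", encoding="utf-8") as handle:
        print(check_pins(parse_pins(handle.read())) or "all pins satisfied")
```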
Exercise with project¶
Discuss: what format is suitable for our course project?
Discuss: what steps are needed to make the program complete?
Ignoring files and paths with .gitignore¶
Compiled and generated files are not committed to version control. There are many reasons for this:
- Your code could be run on different platforms.
- These files are automatically generated and thus do not contribute in any meaningful way.
- The number of changes to track per source code change can increase quickly.
- When tracking generated files you could see differences in the code although you haven't touched the code.
For this we use .gitignore files.
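To get a feeling for what such patterns do, here is a deliberately simplified sketch of the matching. The three patterns are hypothetical examples for a Python project, not copied from the course repository. Real .gitignore rules (negation with `!`, anchored paths, `**`) are richer than this approximation.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical patterns for a Python project (an assumption for illustration).
PATTERNS = ("__pycache__/", "*.pyc", "venv/")


def is_ignored(path: str, patterns: tuple[str, ...] = PATTERNS) -> bool:
    """Rough approximation of .gitignore matching for simple patterns and file paths."""
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything below a directory with that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern):
            # File pattern: compare against the file name only.
            return True
    return False


for candidate in (
    "code/planet_main.py",
    "code/__pycache__/planet_main.cpython-312.pyc",
    "venv/pyvenv.cfg",
):
    print(candidate, is_ignored(candidate))
```

In a real repository, `git check-ignore -v <path>` reports which rule, if any, makes Git ignore a given path.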
From our project repo
programming_formalisms_project_summer_2024/blob/main/.gitignore>
Key points
Make sure it works for others, and for you in the future!
Parts to be covered!
- ☑ Source/version control
- Git
- We have a starting point!
- GitHub as remote backup
- branches
- ☑ Planning
- ☑ Analysis
- ☑ Design
- ☑ Testing
- Different levels
- ☑ Collaboration
- GitHub
- pull requests
- ☐ Sharing
- ☑ open science
- ☐ citation
- ☑ licensing
- ☑ deploying
- ☐ Documentation
- ☑ in-code documentation
- ☐ finish documentation