Parallel computation
Learning outcomes
- Schedule and run a job that needs more cores, running a calculation in your favorite language
- Understand when it is possible/impossible and/or useful/useless to run a job on multiple cores
For teachers
Teaching goals are:
- Learners have scheduled and run a job that needs more cores, running a calculation in their favorite language
- Learners understand when it is possible/impossible and/or useful/useless to run a job on multiple cores
Prior knowledge question:
- What is parallel computing?
Feedback questions:
- When to use parallel computing?
- When not to use parallel computing?
Why parallel computing is important
Most HPC clusters enforce a maximum duration for a job, often around 10 days. Your calculation may take longer than that. One technique that may help is parallel computing, where multiple CPU cores work together on the same calculation.
Types of ‘doing more things at the same time’
| Type of parallelism | Number of cores | Number of nodes | Memory | Library |
|---|---|---|---|---|
| Single-threaded | 1 | 1 | As given by the operating system | None |
| Threaded/shared memory | Multiple | 1 | Shared by all cores | OpenMP |
| Distributed | Multiple | Multiple | Distributed across nodes | MPI (e.g. OpenMPI) |
- Threaded parallelism: calculations that use multiple cores with a shared memory, on a single node (see the example job script after this list)
- Distributed parallelism: uses a Message Passing Interface (MPI) for jobs that span multiple nodes, for example a weather prediction
- Slurm job arrays: for running embarrassingly parallel jobs, for example the same simulation with different random numbers (not covered in this session)
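As a concrete example of the threaded case, here is a minimal Slurm job script sketch for a shared-memory (OpenMP) program. The program name `my_openmp_program` and the resource values are placeholders, and the available options vary per cluster:

```bash
#!/bin/bash
#SBATCH --job-name=threaded-demo
#SBATCH --nodes=1            # threaded parallelism: a single node only
#SBATCH --ntasks=1           # one process...
#SBATCH --cpus-per-task=4    # ...using multiple cores
#SBATCH --time=00:10:00

# Tell OpenMP to use as many threads as cores were allocated
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

./my_openmp_program
```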
When to use parallel computing
- Be aware of Amdahl’s law and/or Gustafson’s law, which describe how much speedup extra cores can give (see the formula below)
- A single-threaded program will never benefit: it cannot use more than one core, no matter how many you request
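To make the first point concrete: Amdahl’s law says that if a fraction p of a calculation can be parallelized, the speedup on n cores is at most

S(n) = 1 / ((1 − p) + p / n)

For example, with p = 0.9, eight cores give at most a speedup of about 4.7, and even infinitely many cores give at most 10.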
Remember
- Use `--ntasks=N`
- Use `srun`
- Use an MPI version of your software: a ‘regular’ non-MPI version will never work!
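Putting these together, a distributed job script could look like the following sketch; the module name and the program name `my_mpi_program` are placeholders that depend on your cluster and software:

```bash
#!/bin/bash
#SBATCH --job-name=mpi-demo
#SBATCH --ntasks=8       # eight MPI processes, possibly spread over multiple nodes
#SBATCH --time=00:10:00

# Load an MPI implementation; the module name varies per cluster
module load OpenMPI

# srun starts one copy of the program per task
srun ./my_mpi_program
```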
Links
- Julia stuff here
- MATLAB stuff here
- R stuff here