Load and run R

  • find the module to be able to run R

  • load the module to be able to run R

  • run the R interpreter

  • run the R command to get the list of installed R packages

  • run an R script from the command-line

Introduction

To allow us to work with R on an HPC cluster, we will:

  • find the module to be able to run R, so we know which versions of R we can pick from

  • load the module to be able to run R, so we can actually run R

  • run the R interpreter, so we can test/develop R code

  • run an R script from the command-line, so we can run R code

In this session, we will follow this typical user journey.

1. Find an R module

To be able to work with R on an HPC cluster, we will need to find a module that loads a specific version of R.

HPC2N, UPPMAX, LUNARC, and most other Swedish HPC centres use the same module system.

To find the modules that load different versions of R, run this from a terminal:

module spider R

To see how to load a specific version of R, including its prerequisites, run:

module spider R/<version>

where <version> is an R version, in major.minor.patch format, for example, module spider R/4.1.2.

2. Load an R module

Once you have found a module for your preferred version of R, you can load it.

Running module spider R/4.1.2 lists the other modules that need to be loaded first. Load these prerequisites together with R, for example:

module load GCC/11.2.0 OpenMPI/4.1.1 R/4.1.2

The prerequisites differ between R versions and between clusters, so use the ones that module spider R/<version> reports on your cluster.

If you care about reproducibility of your programming environments and R scripts, you should always load a specific version of a module.
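As a minimal sketch of such a pinned setup, the snippet below loads the explicit versions from the example above and records which modules ended up loaded. The file name loaded_modules.txt is an assumption chosen for illustration, and the module versions are only valid on clusters that provide them, so check with module spider first.

```shell
# Explicit versions, taken from the example above; adjust to your cluster.
r_modules="GCC/11.2.0 OpenMPI/4.1.1 R/4.1.2"

if command -v module >/dev/null 2>&1; then
    # Unquoted on purpose: each module name must be a separate argument.
    module load $r_modules
    # Save the list of loaded modules next to your results
    # (loaded_modules.txt is a hypothetical file name).
    module list 2>&1 | tee loaded_modules.txt
else
    echo "No module command found: run this on the cluster."
fi
```

Keeping such a record alongside your results makes it easier to recreate the same environment later.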

3. Use the R interpreter

Now that you have loaded a module for a specific version of R, you can use the R interpreter from the terminal.

Here we show:

  • how to start the interpreter

  • how to do a trivial R thing

  • how to see the list of installed R packages

  • how to load an R package

  • how to quit the interpreter

3.1. Start the R interpreter

With the R module loaded, start the R interpreter from the terminal like this:

R

3.2. Do a trivial R thing

Warning

Only do lightweight things!

We are still on the login node, which is shared with many other users. This means that if we do heavy calculations, all these other users are affected.

If you need to do heavy calculations:

  • Submit that calculation as a batch job

  • UPPMAX only: use an interactive session

This will be shown in a later session of the course.

Within the R interpreter we can give R commands:

print("Hello world")

This gives the output:

[1] "Hello world"

3.3. See the list of installed R packages

From within the R interpreter, we can check which packages are installed using:

installed.packages()
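The full output of installed.packages() can be long. As a sketch of a more compact check, the same function can also be called from the terminal (outside the interpreter) with Rscript -e, which evaluates an R expression without starting an interactive session. Showing only the package names via rownames() is a choice made for this illustration, not part of the course material.

```shell
# R expression that keeps only the package names
# (installed.packages() returns a matrix with one row per package).
r_expr='rownames(installed.packages())'

if command -v Rscript >/dev/null 2>&1; then
    Rscript -e "$r_expr"
else
    echo "Rscript not found: load an R module first."
fi
```

This is still a lightweight command, so it is fine to run on the login node.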

3.4. Load an R package

From within the R interpreter, we can load a package like this:

library(ggplot2)

3.5. Quit the interpreter

To quit the R interpreter, use the quit function:

quit()

You will get the question:

Save workspace image? [y/n/c]:

where you type n until you know what that is :-)

4. Run an R script

With the R module loaded, we can run an R script from the terminal like this:

Rscript <r_script_name>

where <r_script_name> is the path to an R script, for example Rscript hello.R.
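To make this concrete, here is a minimal sketch that creates a one-line R script and runs it. The file name my_hello.R and its contents are hypothetical and only for illustration; they are not the course's hello.R.

```shell
# Write a tiny, hypothetical R script (not the course's hello.R).
cat > my_hello.R <<'EOF'
print("Hello world")
EOF

# Run the script, but only if an R module has been loaded.
if command -v Rscript >/dev/null 2>&1; then
    Rscript my_hello.R
else
    echo "Rscript not found: load an R module first."
fi
```

With an R module loaded, this should print the same [1] "Hello world" line as the interpreter example.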

Warning

Only do lightweight things!

We are still on the login node, which is shared with many other users. This means that if we do heavy calculations, all these other users are affected.

If you need to do heavy calculations:

  • Submit that calculation as a batch job

  • UPPMAX only: use an interactive session

This will be shown in a later session of the course.

Exercises

Exercise 1: find an R module

Note

Learning objectives

  • find the module to be able to run R

Use the module system to find which versions of R are provided by your cluster’s module system.

Exercise 2: load an R module

Note

Learning objectives

  • load the module to be able to run R

For this course, we recommend these versions of R:

HPC center    R version
HPC2N         4.1.2
LUNARC        4.2.1
UPPMAX        4.1.1

Load the module for the R version recommended for this course.

Exercise 3: use the R interpreter

Note

Learning objectives

  • run the R interpreter

  • run the R command to get the list of installed R packages

Here we:

  • start the R interpreter

  • find out which packages are already installed

  • load an R package

Exercise 3.1: start the R interpreter

Start the R interpreter.

Exercise 3.2: check which packages are installed

From within the R interpreter, check which packages are installed.

Exercise 3.3: load a package

From within the R interpreter, load the parallel package.

Exercise 4: run an R script

Note

Learning objectives

  • run an R script from the command-line

In this exercise, we will run an example script.

Exercise 4.1: get an R script

Get the R script hello.R by downloading it from the terminal:

wget https://raw.githubusercontent.com/UPPMAX/R-python-julia-HPC/main/exercises/r/hello.R

Exercise 4.2: run

Run the R script called hello.R, using Rscript.

Exercise 5: download and extract the tarball with exercises

See here how to download and extract the tarball with exercises.

Conclusions

Keypoints

One needs to:

  • first find a module to run R

  • load one or more modules to run R

  • if one cares about reproducibility, use explicit versions of modules

  • start the R interpreter with R

  • run R scripts with Rscript

However:

  • as we work on a login node, we can only do lightweight things

  • we can only use the R packages installed with the R module

  • we do not work in an isolated environment

These will be discussed in other sessions.