Introduction to R

  • see a first overview of the R programming language

  • see the overview of the course

Course learning objectives

  • use the module system to load R

  • use the module system to load site-installed R packages

  • find out which versions of R and packages are installed

  • run R scripts

  • write a batch script for running R

  • install R packages from CRAN

  • see how to install other R packages yourself

  • start batch jobs

  • run RStudio

All of this is done on HPC2N or UPPMAX.

Course non-goals

  • improve R coding skills

  • use R on other HPC clusters

First overview of R

R is a programming language for statistical computing and data visualization (from Wikipedia).

flowchart TD

    subgraph r[R]
      r_interpreter[the R interpreter]
      r_packages[R packages]
      r_language[the R programming language]
      r_dev[R software development]
      rstudio[RStudio]

      interpreted_language[Interpreted]
      cran[CRAN]
    end

    r_language --> |has| r_dev
    r_language --> |is| interpreted_language 
    r_language --> |uses| r_packages
    interpreted_language --> |done by| r_interpreter
    r_packages --> |maintained by| cran
    r_dev --> |commonly done in| rstudio

The main general R resources are:

R is used in many NAISS centres:

R Exercise files

  • On HPC2N, you can copy the R exercise tarball from /proj/nobackup/hpc2n2024-025/exercises-r.tar.gz

  • On UPPMAX, you can copy the R exercise tarball from /proj/naiss2024-22-107/exercises-r.tar.gz
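The copy step can be done from a terminal on the cluster. The following is a minimal sketch, assuming a Bash shell on a login node. `fetch_exercises` is a hypothetical helper name and the target directory `$HOME/r-course` is only an example; the tarball paths are the ones listed above.

```shell
#!/usr/bin/env bash
# Copy an exercise tarball into a working directory and unpack it there.
# fetch_exercises is a hypothetical helper, not part of the course material.
fetch_exercises() {
  local tarball="$1"          # full path to exercises-r.tar.gz on the cluster
  local workdir="${2:-$PWD}"  # where to unpack; defaults to the current directory

  if [ ! -f "$tarball" ]; then
    echo "Tarball not found: $tarball" >&2
    return 1
  fi

  mkdir -p "$workdir" || return 1
  cp "$tarball" "$workdir/" || return 1
  # -x extract, -z gunzip, -f archive file, -C unpack inside workdir
  tar -xzf "$workdir/$(basename "$tarball")" -C "$workdir"
}

# Example calls (the target directory is an assumption):
#   fetch_exercises /proj/nobackup/hpc2n2024-025/exercises-r.tar.gz "$HOME/r-course"   # HPC2N
#   fetch_exercises /proj/naiss2024-22-107/exercises-r.tar.gz "$HOME/r-course"         # UPPMAX
```

The helper returns a non-zero status when the tarball is missing, which is what happens if the HPC2N path is used on UPPMAX or the other way around.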

Preliminary schedule

flowchart TD

    subgraph login[HPC login]
      ssh[0. SSH]
      remote_desktop_website[5. Remote desktop website]
      remote_desktop_local_thinlinc_client[5. Remote desktop with local ThinLinc client]
    end
    subgraph scheduler[scheduler]
      running_batch_jobs[3. Running batch jobs]
      running_interactive_session[5. Running an interactive session]
    end
  
    login --> |allows for| scheduler

flowchart TD

    subgraph r[R]
      r_interpreter[1. the R interpreter]
      r_packages[2. R packages]
      r_virtual_environments[2. R virtual environments]
      r_language[1. the R programming language]
      parallel_and_multithreaded_functions[3. Parallel and multithreaded functions]
      r_dev[5. R software development]
      rstudio[5. RStudio]
      ml[4. Machine learning]
      interpreted_language[1. Interpreted]
      cran[1. CRAN]
    end
    subgraph modules[modules]
      r_module[1. R module]
      r_packages_module[2. R_packages module]
      rstudio_module[5. RStudio module]
    end

  
    r_language --> |has| r_dev
    r_language --> |is| interpreted_language 
    r_language --> |uses| r_packages
    interpreted_language --> |done by| r_interpreter
    r_packages --> |maintained by| cran
    r_packages --> |isolated by|r_virtual_environments 
    r_language --> |allows| parallel_and_multithreaded_functions
    r_language --> |provides for| ml
    r_dev --> |commonly done in| rstudio

    r_interpreter --> |loaded by|r_module
    r_packages --> |loaded by|r_packages_module
    rstudio --> |loaded by|rstudio_module

    rstudio_module --> |automatically loads latest| r_packages_module
    r_packages_module --> |automatically loads corresponding version of| r_module
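
Read as shell commands, the module relationships in the diagram could look like the sketch below. It assumes the cluster's `module` command is available and that the RStudio module is called `RStudio`; the exact name and version differ per cluster, so check with `module avail` first. `load_rstudio_stack` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: load RStudio through the module system and list what came along.
# The module name "RStudio" is an assumption taken from the diagram labels.
load_rstudio_stack() {
  if ! command -v module >/dev/null 2>&1; then
    echo "The 'module' command is not available in this shell" >&2
    return 1
  fi
  # Per the diagram, this should also load R_packages, which in turn loads R.
  module load RStudio
  # Show the loaded modules to confirm which R and R_packages versions came along.
  module list
}
```

Calling `load_rstudio_stack` on a login node and reading the `module list` output is one way to confirm which versions were loaded automatically.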

Preliminary times

Time   Topic                                                      Activity
9:00   Syllabus                                                   10m
9:10   Introduction, R in general                                 Lecture 10m
9:20   Loading modules and running R codes                        Lecture+code along 25m
9:45   Coffee break
10:00  Packages                                                   Lecture+code along 30m
10:30  Isolated environments                                      Lecture+code along 20m
10:50  break
11:00  SLURM Batch scripts for R jobs                             Lecture+code along + exercise 30m
11:30  Parallel and multithreaded functions                       Lecture+code along 35m
12:00  LUNCH
13:00  Exercises and informal chat (or break)
13:15  ML                                                         Lecture+code along 35m
13:50  break
14:00  Parallel session - HPC2N: ThinLinc & RStudio               Lecture+code along 25m
14:00  Parallel session - UPPMAX: Interactive/ThinLinc & RStudio  Lecture+code along 25m
14:25  Summary
14:35  Evaluation
14:45  Q&A on-demand
15:00  END