Introduction R

  • see a first overview of the R programming language

  • see the overview of the course

  • hear about ‘the tarball with exercises’

Course learning objectives

  • use the module system to load R

  • use the module system to load site-installed R packages

  • find out which versions of R and packages are installed

  • run R scripts

  • write a batch script for running R

  • install R packages from CRAN

  • see how to install other R packages yourself

  • start batch jobs

  • run RStudio

These objectives apply to working on HPC2N, UPPMAX, or LUNARC.
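As a rough sketch of what the first few objectives look like on the command line: the example below assumes the cluster provides the `module` command and that the R module is simply called `R`. That name is an assumption. The exact module names, versions, and prerequisites differ between the clusters and are covered in the later sessions. The checks with `command -v` are only there so the sketch also runs on a computer without a module system or without R.

```shell
#!/bin/bash
# Sketch only: the module name "R" is an assumption,
# the exact name and version differ per cluster.

# The 'module' command only exists where a module system is installed
if command -v module > /dev/null 2>&1; then
  module avail R   # list the R modules that are installed
  module load R    # load the default R module
  status="module system found"
else
  status="no module system found"
fi
echo "${status}"

# Rscript runs R code non-interactively; here it prints the R version
if command -v Rscript > /dev/null 2>&1; then
  Rscript -e 'cat(R.version.string, "\n")'
fi
```

On a cluster, the same `Rscript` command can be given the name of an R script file instead of `-e` with inline code.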

Course non-goals

  • improve R coding skills

  • use R on other HPC clusters

First overview of R

R is a programming language for statistical computing and data visualization (from Wikipedia).

flowchart TD

    subgraph r[R]
      r_interpreter[the R interpreter]
      r_packages[R packages]
      r_language[the R programming language]
      r_dev[R software development]
      rstudio[RStudio]

      interpreted_language[Interpreted]
      cran[CRAN]
    end

    r_language --> |has| r_dev
    r_language --> |is| interpreted_language 
    r_language --> |uses| r_packages
    interpreted_language --> |done by| r_interpreter
    r_packages --> |maintained by| cran
    r_dev --> |commonly done in| rstudio

The main general R resources are:

R is used in many NAISS centres:

Schedule

flowchart TD

    subgraph login[HPC login]
      ssh[0.SSH]
      remote_desktop_website[5.Remote desktop website]
      remote_desktop_local_thinlinc_client[5.Remote desktop with local ThinLinc client]
    end
    subgraph scheduler[scheduler]
      running_batch_jobs[3.Running batch jobs]
      running_interactive_session[5.Running an interactive session]
    end
  
    login --> |allows for| scheduler

flowchart TD

    subgraph r[R]
      r_interpreter[1.the R interpreter]
      r_packages[2.R packages]
      r_virtual_environments[2.R virtual environments]
      r_language[1.the R programming language]
      parallel_and_multithreaded_functions[3.Parallel and multithreaded functions]
      r_dev[5.R software development]
      rstudio[5.RStudio]
      ml[4.Machine learning]
      interpreted_language[1.Interpreted]
      cran[1.CRAN]
    end
    subgraph modules[modules]
      r_module[1.R module]
      r_packages_module[2.R_packages module]
      rstudio_module[5.RStudio module]
    end

  
    r_language --> |has| r_dev
    r_language --> |is| interpreted_language 
    r_language --> |uses| r_packages
    interpreted_language --> |done by| r_interpreter
    r_packages --> |maintained by| cran
    r_packages --> |isolated by|r_virtual_environments 
    r_language --> |allows| parallel_and_multithreaded_functions
    r_language --> |provides for| ml
    r_dev --> |commonly done in| rstudio

    r_interpreter --> |loaded by|r_module
    r_packages --> |loaded by|r_packages_module
    rstudio --> |loaded by|rstudio_module

    rstudio_module --> |automatically loads latest| r_packages_module
    r_packages_module --> |automatically loads corresponding version of| r_module

| Time  | Topic                         | Teacher(s)   |
|-------|-------------------------------|--------------|
| 9:00  | (optional) First login        | BB + PO + RB |
| 9:45  | Break                         | .            |
| 10:00 | Introduction                  | RB           |
| 10:10 | Syllabus                      | RB           |
| 10:20 | Load modules and run          | RB           |
| 10:45 | Break                         | .            |
| 11:00 | Packages                      | BB           |
| 11:30 | Isolated environments         | BB           |
| 12:00 | Lunch                         | .            |
| 13:00 | Batch                         | BB           |
| 13:30 | Parallel                      | PO           |
| 14:15 | Break                         | .            |
| 14:30 | Simultaneous session          | .            |
| .     | HPC2N: ThinLinc, RStudio      | PO           |
| .     | LUNARC: On-Demand, RStudio    | RP           |
| .     | UPPMAX: Interactive, RStudio  | RB           |
| 15:15 | Break                         | .            |
| 15:30 | Machine learning              | BB or PO     |
| 16:00 | Summary and evaluation        | RB           |
| 16:15 | Done                          | .            |

  • RB: suggestion for the next course iteration: make ‘Batch’ 15 minutes longer and remove a session

Exercises used in the course

The course uses a so-called tarball: a single archive file that contains the exercises used in this course.

See here how to get and decompress it.

There is time to do so in the ‘Load modules and run’ session.
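For orientation, here is a minimal sketch of what decompressing a tarball involves. The file name `exercises.tar.gz` and the script `hello.R` are hypothetical stand-ins that the example creates itself, so that it runs anywhere. They are not the actual course files, which are obtained as described in the course material.

```shell
#!/bin/bash
# Sketch only: 'exercises.tar.gz' and 'hello.R' are hypothetical names,
# not the actual course files.
set -e

# Work in a scratch folder, so that no existing files are overwritten
demo_dir="$(mktemp -d)"
cd "${demo_dir}"

# Create a tiny stand-in tarball (the real one would be downloaded instead)
mkdir exercises
echo 'print("Hello from R")' > exercises/hello.R
tar -czf exercises.tar.gz exercises
rm -r exercises

# List the contents of the tarball without extracting it
tar -tzf exercises.tar.gz

# Decompress and extract: x = extract, z = gzip, f = the file name follows
tar -xzf exercises.tar.gz
ls exercises
```

Listing the contents first with `tar -tzf` is a cheap way to see which folder the tarball will create before extracting it.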