
DL Exercises

Info

We have put some exercises here for you, in case you want more hands-on practice.

Prepare your project folder

Make arrangements for the new project
  • Access your project uppmax2024-2-21 by logging in to Rackham via ThinLinc, ssh, or VSCode.
  • Go to the project's private folder and make an empty folder with your name.

    Answer

    ssh jayan@rackham.uppmax.uu.se
    # or, with X11 forwarding enabled:
    ssh -X jayan@rackham.uppmax.uu.se
    # then, from inside the project's private folder:
    mkdir <your_name>

Transferring files

Copy files to your private folder
  • Use scp to copy a file from your local laptop to your folder on uppmax2024-2-21. Download the CIFAR-10 python pickled dataset here
  • Do the same activity with FileZilla or WinSCP. Delete the data you uploaded earlier to make space for the new upload.

    Answer

    Refer to SCP documentation here
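
    As a rough sketch of the scp step, the script below assembles an upload and a download command and prints them, so you can check them before running them on your laptop. Several values are assumptions and not taken from the exercise: the folder name your_name, the archive name cifar-10-python.tar.gz, and the storage layout /proj/<project>/private. Adjust them to match what you actually have.

```shell
#!/bin/bash
# Assumed values: replace with your own user name, folder and file.
user="jayan"                        # login name used in the ssh answer
project="uppmax2024-2-21"
folder="your_name"                  # hypothetical folder in the private area
archive="cifar-10-python.tar.gz"    # assumed name of the downloaded dataset

# Assumed layout: project storage lives under /proj/<project>/private
remote="${user}@rackham.uppmax.uu.se:/proj/${project}/private/${folder}/"

# Upload: local file first, remote destination second.
upload_cmd="scp ${archive} ${remote}"
# Download: same syntax with source and destination swapped.
download_cmd="scp ${remote}${archive} ."

# Print the commands instead of running them.
echo "$upload_cmd"
echo "$download_cmd"
```

    Run the printed commands on your laptop, not on Rackham. scp asks for your password each time unless you have set up SSH keys.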

Using the compute nodes

Submit a Slurm job
Answer
  • edit a file using your preferred editor, named run.sh, for example, with the content
    #!/bin/bash -l

    #SBATCH -A uppmax2024-2-21
    #SBATCH -p node
    #SBATCH -N 1
    #SBATCH -t 01:00:00
    #SBATCH -J cifar_demo
    #SBATCH -M snowy
    #SBATCH --gres=gpu:1

    module load python_ML_packages/3.9.5-gpu

    python -c "import torch; print(torch.__version__); print(torch.version.cuda); print(torch.cuda.get_device_properties(0)); print(torch.randn(1).cuda())"

    #for model in resnet20 resnet32 resnet44 resnet56 resnet110 resnet1202
    for model in resnet20 resnet110
    do
        echo "python -u trainer.py  --arch=$model  --save-dir=save_$model |& tee -a log_$model"
        python -u trainer.py  --arch=$model  --save-dir=save_$model |& tee -a log_$model
    done
  • make the job script executable

    $ chmod a+x run.sh
    

  • submit the job

    $ sbatch run.sh
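
    Once the job is submitted you will usually want to check on it. The sketch below assumes the job was submitted with sbatch --parsable, which prints only the job ID followed by the cluster name (separated by a semicolon) when a cluster is given with -M. The job ID is made up for illustration, and the follow-up commands are printed, not executed.

```shell
#!/bin/bash
# On Rackham this line would be: submitted=$(sbatch --parsable run.sh)
submitted="1234567;snowy"      # hypothetical output, for illustration only

jobid="${submitted%%;*}"       # text before the first ';'
cluster="${submitted##*;}"     # text after the last ';'

# Follow-up commands (printed here so the sketch runs anywhere).
echo "squeue -M ${cluster} -j ${jobid}"    # is the job pending or running?
echo "scancel -M ${cluster} ${jobid}"      # cancel it if something is wrong
echo "tail -f slurm-${jobid}.out"          # follow the default Slurm output file
```

    Because the script sets -M snowy, squeue and scancel need the same -M flag. Without it they only look at the cluster you are logged in to.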
    

Doing installations

Conda installation

Install with Conda directly on Rackham
  • Install python>3.11, transformers, torch, torchvision, notebook (using pip), pytorch-cuda=12.4, ipython, pillow
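
One possible way to do this is sketched below. The script only assembles and prints the commands. It makes several assumptions that do not come from the exercise: the environment name dl_exercise, Python 3.12 as the version newer than 3.11, the channels pytorch, nvidia and conda-forge, and the presence of a conda module on Rackham. Check each of these against the cluster documentation before running anything.

```shell
#!/bin/bash
env_name="dl_exercise"         # hypothetical environment name

# Packages installed with conda; pytorch-cuda selects the CUDA 12.4 build.
conda_pkgs="python=3.12 pytorch torchvision pytorch-cuda=12.4 transformers ipython pillow"
# notebook is installed with pip, as the exercise asks.
pip_pkgs="notebook"

# Assumed channels: pytorch and nvidia for PyTorch, conda-forge for the rest.
create_cmd="conda create -n ${env_name} -c pytorch -c nvidia -c conda-forge ${conda_pkgs}"
pip_cmd="pip install ${pip_pkgs}"

# Print the steps in the order they would be run.
echo "module load conda"       # assumed module name on Rackham
echo "$create_cmd"
echo "conda activate ${env_name}"
echo "$pip_cmd"
```

Running pip only after conda activate installs notebook into the new environment and not into your home directory.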