Exercises¶
Info
Here are some exercises for those who want more hands-on practice.
Working with modules¶
View in IGV
- Load the genome, the BAM file, and the annotated VCF that we got from the demo into IGV for viewing.
Transferring files¶
Copy files between two Sens projects
- Use the Transit server to copy a file (e.g. the interactive session script) from the Bianca workshop project to another project, if you belong to one.
Answer
- Connect to transit
- Mount the projects with mount_wharf
- Move/copy the file(s) from sens2023598 to your other project
- Log in to the other Sens project on Bianca and move the file from the wharf to a good place
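The copy step in the middle of this answer can be sketched as a small shell function. This is only an illustration, under the assumptions that you are already logged in on transit, that both projects have been mounted with mount_wharf, and that each wharf then shows up as a directory under your home directory. The function name, the placeholder project sens20XXXXX, and the file name are hypothetical.

```shell
#!/bin/bash
# Sketch only. Assumed to be run on transit after
#   mount_wharf sens2023598
#   mount_wharf sens20XXXXX     (placeholder for your other project)
# so that each wharf is visible as a directory.

copy_between_wharfs() {
    local src_dir=$1 dst_dir=$2 file=$3
    # stop early if one of the wharf directories is not there
    [ -d "$src_dir" ] || { echo "missing directory: $src_dir" >&2; return 1; }
    [ -d "$dst_dir" ] || { echo "missing directory: $dst_dir" >&2; return 1; }
    # -a preserves timestamps and permissions
    cp -a "$src_dir/$file" "$dst_dir/"
}

# usage (paths and file name are placeholders):
# copy_between_wharfs "$HOME/sens2023598" "$HOME/sens20XXXXX" interactive_session.sh
```

The directory checks are there because a forgotten mount_wharf is an easy mistake to make, and cp would otherwise fail with a less helpful message.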
Using the compute nodes¶
Submit a Slurm job
- Make a batch job to run the demo "Hands on: Processing a BAM file to a VCF using GATK, and annotating the variants with snpEff". Ask for 2 cores for 1h.
Answer
- using your preferred editor, create a file named, for example, my_bio_worksflow.sh with the content
#!/bin/bash
#SBATCH -A sens2023598
#SBATCH -J workflow
#SBATCH -t 01:00:00
#SBATCH -p core
#SBATCH -n 2
cd /proj/sens2023598/workshop/slurm/
module load bioinfo-tools
# load samtools
module load samtools/1.17
# copy an example BAM file
cp -a /proj/sens2023598/workshop/data/ERR1252289.subset.bam .
# index the BAM file
samtools index ERR1252289.subset.bam
# load the GATK module
module load GATK/4.3.0.0
# make symbolic links to the hg38 genomes
ln -s /sw/data/iGenomes/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/genome.* .
# create a VCF containing inferred variants
gatk HaplotypeCaller --reference genome.fa --input ERR1252289.subset.bam --intervals chr1:100300000-100800000 --output ERR1252289.subset.vcf
# use snpEff to annotate variants
module load snpEff/5.1
java -jar $SNPEFF_ROOT/snpEff.jar eff hg38 ERR1252289.subset.vcf > ERR1252289.subset.snpEff.vcf
# compress the annotated VCF and index it
bgzip ERR1252289.subset.snpEff.vcf
tabix -p vcf ERR1252289.subset.snpEff.vcf.gz
- make the job script executable
- submit the job
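The last two steps might look like the sketch below, assuming the script was saved under the example name used above. The function name submit_job is made up for this illustration. On a machine without Slurm it only prints the command it would have run.

```shell
#!/bin/bash
# Sketch of the two final steps; the file name follows the example above.
script=my_bio_worksflow.sh

submit_job() {
    # make the job script executable
    chmod +x "$1"
    if command -v sbatch >/dev/null 2>&1; then
        # on the cluster: hand the script over to Slurm
        sbatch "$1"
    else
        # anywhere else: only show what would be run
        echo "would run: sbatch $1"
    fi
}

# submit only if the script is present in the current directory
if [ -f "$script" ]; then
    submit_job "$script"
fi
```

Once the job is submitted, squeue -u $USER lists it while it is pending or running.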
Doing installations¶
R package installation¶
Install dowser
- Install the package dowser on Rackham and use sftp to get it to Bianca.
- Dowser on ReadTheDocs
Answer
Conda installation¶
Install with Conda directly on Bianca
- Install python=3.7 and numpy=1.15 with Conda directly on Bianca.
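One possible way to do this is sketched below. It assumes that conda has already been made available in the session (on UPPMAX presumably with module load conda). The function name make_env and the environment name are hypothetical.

```shell
#!/bin/bash
# Sketch only: create a Conda environment with the two pinned packages.

make_env() {
    local env_name=$1
    if command -v conda >/dev/null 2>&1; then
        # --yes skips the interactive confirmation
        conda create --yes --name "$env_name" python=3.7 numpy=1.15
    else
        echo "conda not found; load it first (e.g. module load conda)" >&2
        return 1
    fi
}

# usage (the environment name is a placeholder):
# make_env my-numpy-env
```

Pinning both versions in one conda create call lets the solver pick a numpy build that matches the requested Python version.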
Pip installation with virtual environment¶
Install with pip
- Make a virtual environment (see this tutorial) with python/3.8.7 on Rackham and install numpy==1.18.1 and matplotlib==3.1.3. Use sftp to get it to Bianca.
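The Rackham part of this exercise could be sketched as follows. It assumes that python/3.8.7 has been loaded so that python3 points to it. The function names, the directory name, and the use of a requirements file are choices made for this illustration and may differ from the tutorial.

```shell
#!/bin/bash
# Sketch only: pin the two packages and install them into a virtual environment.

write_requirements() {
    # one pinned package per line, versions taken from the exercise
    printf '%s\n' 'numpy==1.18.1' 'matplotlib==3.1.3' > "$1"
}

build_venv() {
    local venv_dir=$1 req_file=$2
    # create the environment with the currently loaded Python
    python3 -m venv "$venv_dir"
    # use the environment's own pip so packages land inside it
    "$venv_dir/bin/pip" install -r "$req_file"
}

# usage on Rackham (needs internet access; names are placeholders):
# write_requirements requirements.txt
# build_venv my_venv requirements.txt
```

Keeping the versions in a requirements file makes it easy to check on Bianca that the transferred environment contains what was intended.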
Julia installation¶
Install a Julia package
- Install Gumbo following the Julia package tutorial. Use sftp to get it to Bianca.
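A rough sketch of the Rackham side is given below. It assumes that julia is available in the session and installs the package into a separate depot directory via JULIA_DEPOT_PATH, which is then packed into one archive for the transfer. The function name and directory names are hypothetical, and the tutorial may organise this differently.

```shell
#!/bin/bash
# Sketch only: install Gumbo into its own depot and pack it for transfer.

install_and_pack() {
    local depot=$1 archive=$2
    mkdir -p "$depot"
    # JULIA_DEPOT_PATH tells Julia where to store installed packages
    JULIA_DEPOT_PATH="$depot" julia -e 'using Pkg; Pkg.add("Gumbo")' || return 1
    # pack the depot into a single file that is easy to send with sftp
    tar -czf "$archive" -C "$(dirname "$depot")" "$(basename "$depot")"
}

# usage on Rackham (needs internet access; names are placeholders):
# install_and_pack "$HOME/julia_depot" "$HOME/julia_depot.tar.gz"
```

On Bianca the archive would be unpacked and JULIA_DEPOT_PATH pointed at the unpacked directory before starting Julia.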
Singularity/Apptainer¶
Install GATK on Bianca with Apptainer
- Use the Docker image for gatk/4.3.0.0 to build a container on Rackham, then transfer it to Bianca.
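The build step on Rackham could look something like the sketch below. The image name broadinstitute/gatk:4.3.0.0 on Docker Hub is an assumption not stated in the exercise, and the function name pull_gatk is made up. The fallback to singularity is there in case only the older command name is installed.

```shell
#!/bin/bash
# Sketch only: turn the GATK Docker image into a single .sif file.

pull_gatk() {
    local sif=$1
    # assumed image location on Docker Hub
    local uri=docker://broadinstitute/gatk:4.3.0.0
    if command -v apptainer >/dev/null 2>&1; then
        apptainer pull "$sif" "$uri"
    elif command -v singularity >/dev/null 2>&1; then
        singularity pull "$sif" "$uri"
    else
        echo "neither apptainer nor singularity found" >&2
        return 1
    fi
}

# usage on Rackham (needs internet access; the file name is a placeholder):
# pull_gatk gatk_4.3.0.0.sif
```

The result is one .sif file, which can be sent to Bianca with sftp in the same way as in the other exercises.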