
File transfer using scp

Learning outcomes

  • Practice using the documentation of your favorite HPC cluster
  • Can transfer files using scp
  • Can compress and archive files before transferring
For teachers

Teaching goals are:

  • Learners have practiced using the documentation of their favorite HPC cluster
  • Learners have transferred files using scp

Lesson plan:

gantt
  title File transfer using scp
  dateFormat X
  axisFormat %s
  section First hour
  Prior : prior, 0, 2s
  Present: present, after prior, 3s
  Challenge: crit, challenge, after present, 10s
  Feedback: feedback, after challenge, 2s

Prior questions:

  • Who has heard of scp?
  • Who has already used scp?

Overview of other terminal transfer tools

Besides rsync, there are the simpler tools scp and sftp.

  • scp takes arguments similar to those of the Linux copy command cp.
  • sftp is more versatile with more file management capabilities.

  • Both are considered secure.

But what are wget and curl?
  • These tools are used to download files from websites or ftp servers
  • wget saves downloaded contents to local files, like

    • wget ftp://ftp.sunet.se/mirror/archive/ftp.sunet.se/pub/pictures/space/*
    • wget https://upload.wikimedia.org/wikipedia/commons/3/37/Grace_Hopper_and_UNIVAC.jpg -O grace_hopper.jpg
    • typical use cases:
      • download data from a service
      • download a program or compressed source code
    • supports HTTP, HTTPS, and FTP
    • user-friendly for basic tasks
    • good for mirroring websites
      • downloading entire directories recursively for offline viewing or backups.
    • most popular on Unix-based systems, like Linux
  • curl outputs the content to the terminal by default.

    • add -O to save the download to a file named after the remote file, or -o <name> to choose the file name yourself.
    • supports a wide range of protocols: HTTP, HTTPS, FTP, FTPS, SCP, SFTP, TFTP...
    • often preferred for scripting and automation due to its versatility
    • typical use cases: interacting with APIs, handling complex web requests
    • often available by default on Windows and macOS.

SCP is an abbreviation for Secure Copy Protocol

Pros

  • Simple
  • One-line command
  • Secure
  • Use cases
    • copy just a file
    • copy just a specific directory (with sub-directories).

When not to use

  • When you need several one-line commands in a row
    • each command asks for your credentials again
  • When you want to do more than a basic file transfer.
  • When you need to create or list directories or delete files on the fly.
  • When a file with the same name already exists in the destination directory: it is overwritten without warning.
  • When a transfer may be interrupted: it cannot be resumed, so you have to restart the entire transfer.

Procedure

  • In the terminal, copy files to or from the cluster using scp:
scp [from] [to]

Where [from] is the file(s) you want to copy, and [to] is the destination.

Copy a file from your local computer to the cluster:

scp [local_filename] [username]@<cluster address>:<path-to-folder>/

To copy a file from the cluster to your local computer, swap the two arguments of the command above:

scp [username]@<cluster address>:<path-to-folder>/[remote_filename] [local_folder or "." for the present folder]
Example for Rackham

This is how you copy a file from your local computer directly to your HOME folder at Rackham:

scp [local_filename] [username]@rackham.uppmax.uu.se:~/

where [local_filename] is the path to a local file, and [username] is your UPPMAX username, for example:

scp my_file.txt sven@rackham.uppmax.uu.se:/home/sven

To copy a file from Rackham to your local computer, swap the two arguments of the command above:

scp [username]@rackham.uppmax.uu.se:/home/[username]/[remote_filename] [local_folder]

where [remote_filename] is the name of a remote file, [username] is your UPPMAX username, and [local_folder] is your local folder, for example:

scp sven@rackham.uppmax.uu.se:/home/sven/my_remote_file.txt /home/sven

If asked, give your center's password. You can avoid this prompt by setting up SSH keys.

Cheat sheet for scp
  • scp is followed by zero or more of the option flags below, then the files and servers involved

  • scp provides a number of options that control every aspect of its behavior. The most widely used options are:

  • -P - Specifies the remote host ssh port.

  • -p - Preserves file modification and access times.
  • -q - Use this option if you want to suppress the progress meter and non-error messages.
  • -C - This option forces scp to compress the data as it is sent to the destination machine.
  • -r - This option tells scp to copy directories recursively.
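
As an illustration of how these options combine, the sketch below uploads a directory recursively while preserving timestamps and compressing the data in transit. The username sven, the directory my_results, the port number and the helper run are assumptions made for this example. By default the sketch only prints the command, so nothing is transferred until you set DRY_RUN=0.

```shell
# Placeholders (assumptions): replace with your own username and directory.
REMOTE="${REMOTE:-sven@rackham.uppmax.uu.se}"
SRC_DIR="${SRC_DIR:-my_results}"

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# -r  copy the directory recursively
# -p  preserve modification and access times
# -C  compress the data in transit
# -P  remote SSH port (22 is the usual default, given here only as an example)
run scp -r -p -C -P 22 "$SRC_DIR" "$REMOTE:/home/sven/"
```

Note that the port flag is an uppercase -P, while the lowercase -p preserves timestamps.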

Warning

  • There is no warning if a file is about to be overwritten.
    • Unlike rm -i, which asks whether you really want to remove a file, scp has no option that asks for confirmation (scp -i exists, but it selects an SSH identity file).
  • 'rsync' may be a better tool if you want to sync existing content.
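
Since scp itself does not ask before overwriting, a small wrapper can add that check for downloads. The function below is a minimal sketch, not part of scp: the name safe_download and the SCP variable are hypothetical, and it only protects local files. Guarding an upload would need a check on the remote side as well.

```shell
# safe_download REMOTE_SPEC [LOCAL_DIR]
# Hypothetical wrapper: refuses to call scp if the file already exists locally.
safe_download() {
    remote_spec="$1"                 # e.g. user@host:/path/to/file
    local_dir="${2:-.}"              # defaults to the present folder

    # Strip everything up to the last ":" and then the directories,
    # leaving only the file name.
    file_name="$(basename "${remote_spec##*:}")"
    target="$local_dir/$file_name"

    if [ -e "$target" ]; then
        echo "Refusing to overwrite existing file: $target" >&2
        return 1
    fi

    # SCP can be set to another command (for example echo) for a dry run.
    ${SCP:-scp} "$remote_spec" "$local_dir/"
}
```

A call could look like: safe_download sven@rackham.uppmax.uu.se:/home/sven/my_remote_file.txt /home/sven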

Large or many files

  • Shorten download/upload time by reducing the size of the file!
    • A common tool in Linux environments is gzip.
    • Usage: gzip <filename>. The file is replaced by a compressed file with the ending .gz
  • Transferring many files creates so-called overhead
    • each file has to be addressed individually.
  • The solution is to gather the files in an archive, for example with tar.
    • A folder with content then behaves like ONE file.
    • Usage: tar -cf archive.tar /path/files or tar -cf archive.tar /path/folder
  • While TARing you may compress the data as well!
    • tar -czf archive.tar.gz /path/files
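
To get a feeling for what gzip does to the file size, the sketch below compresses a generated test file in a temporary directory and compares the size before and after. The file name data.txt and its content are made up for this example. Real data may shrink much less, especially files that are already compressed, such as many image formats.

```shell
# Work in a throwaway directory so none of your own files are touched.
WORK="$(mktemp -d)"
cd "$WORK" || exit 1

# Create a highly compressible example file: 10000 identical lines.
for i in $(seq 1 10000); do
    echo "sample line of text"
done > data.txt
SIZE_BEFORE="$(wc -c < data.txt)"

# gzip replaces data.txt with data.txt.gz
gzip data.txt
SIZE_AFTER="$(wc -c < data.txt.gz)"

echo "before: $SIZE_BEFORE bytes, after: $SIZE_AFTER bytes"
```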
Extract/inflate
  • gunzip compressed_file.gz
  • tar -xf archive.tar
  • tar -xzf compressed_archive.tar.gz
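
The archive and extract commands above can be tried out locally before any transfer. The following sketch runs a full round trip in a temporary directory; the names many_files and received are only examples, and the received directory stands in for the receiving side of a transfer.

```shell
# Work in a throwaway directory so none of your own files are touched.
WORK="$(mktemp -d)"
cd "$WORK" || exit 1

# Create a directory with many empty files.
mkdir many_files
for i in $(seq 1 100); do
    touch "many_files/my-file-$i.txt"
done

# Archive and compress in one step: the directory now behaves like ONE file.
tar -czf many_files.tar.gz many_files

# List the archive content without extracting it, and count the files.
N_ARCHIVED="$(tar -tzf many_files.tar.gz | grep -c '\.txt$')"

# Extract into a separate directory, as you would after a transfer.
mkdir received
tar -xzf many_files.tar.gz -C received
N_EXTRACTED="$(ls received/many_files | wc -l)"

echo "files in archive: $N_ARCHIVED, files extracted: $N_EXTRACTED"
```

Only the single file many_files.tar.gz would then have to be sent with scp.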
gzip cheat sheet
  • gzip manual: https://www.gnu.org/software/gzip/manual/gzip.html#Sample
tar cheat sheet
  • tar cheat sheet (devhints.io): https://devhints.io/tar
Can I use archiving and compressing in all transfer methods?
  • Yes!

Exercises

You may want to prioritize the next session instead!

Exercise 0: Use the documentation of your HPC cluster
  • Search for how to transfer files to/from your HPC cluster using scp. At which URL is it described?
    • Tip: not all HPC centers have documented this, so you should give up searching after a while.
Where is that documentation?

| HPC cluster | Link to documentation |
|-------------|-----------------------|
| Alvis | here |
| Berzelius | here |
| Bianca | Available for download via the transit server, see here |
| COSMOS | FIX? None. Closest is here. |
| Dardel | here |
| Kebnekaise | here |
| LUMI | To do |
| Rackham | here |
| Tetralith | here |
| Vera | FIX? None. Closest is here. |

Exercise 1: Upload a file from your computer, using scp

Tips

  • (If you want to create a file in local terminal: $ touch local_file)
  • (You can check the file structure in an ssh session)
  • Send it to an existing folder (e.g. transfer) on Tetralith
  • Check on server that it is there
Answer (Tetralith example)

Locally

  • (If you want to create a file in local terminal: $ touch local_file)
  • Send it to an existing folder (e.g. transfer) on Tetralith: $ scp local_file [username]@tetralith.nsc.liu.se:~/transfer/

Check on server that it is there

Exercise 2: Download a file from the server to your computer, using scp

Tips

  • (If you want to create a file in remote ssh terminal: $ touch remote_file)
  • Send it to an existing local folder
  • Check locally that it is there
Answer (Tetralith example)

On the server

  • (If you want to create a remote file first, in an SSH session, do: $ touch ~/transfer/remote_file)

Locally

  • Download it to an existing local folder (here the present folder, "."): $ scp [username]@tetralith.nsc.liu.se:~/transfer/remote_file .

Check locally that it is there

(Optional) Exercise 3: Download a directory with many files

Tips

  • Create 1000 files REMOTELY in a directory with name many_files
    • $ mkdir many_files
    • $ cd many_files
    • $ touch my-file-{1..1000}.txt
  • Time the download of the directory, using time, and the recursive option to include the files within the directory
    • time scp ....
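Answer (Tetralith example)

Locally, assuming many_files was created in your home folder on Tetralith

  • $ time scp -r [username]@tetralith.nsc.liu.se:~/many_files .
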
(Optional) Exercise 4: Download the compressed directory

Tips

  • Archive and compress the many_files directory
  • Time the download of the compressed directory, using time.
    • If time does not work, count the seconds!
Answer (Tetralith example)

Archiving and compression step on REMOTE

  • tar -cvzf many_files.tar.gz many_files

LOCALLY