Transfer tips
Learning outcomes
- (Optional) Can compress and archive files before transferring
Large or many files
- Shorten download/upload time by reducing the size of a file!
- A common tool in Linux environments is gzip.
  - Usage: gzip <filename>. You'll get a file with the .gz ending (it replaces the original file).
- Transferring many files creates so-called overhead
  - each file has to be addressed individually.
- The solution is to gather the files in an archive, for example with tar.
  - A folder with content then behaves like ONE file.
  - Usage:
    tar -cf archive.tar /path/files
    or
    tar -cf archive.tar /path/folder
  - While TARing you may compress the data as well!
    tar -czf archive.tar.gz [/path/files]
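A minimal sketch of the two tools above, run in a throwaway directory. The names numbers.txt and demo_folder are hypothetical, and gzip -c is used instead of plain gzip <filename> only so that the original file stays available for a size comparison.

```shell
# Work in a throwaway directory (all names here are hypothetical)
workdir=$(mktemp -d)

# 1) Compress a single file. seq produces repetitive text that compresses well.
seq 1 100000 > "$workdir/numbers.txt"
# -c writes the compressed data to stdout, so the original file is kept
gzip -c "$workdir/numbers.txt" > "$workdir/numbers.txt.gz"
size_before=$(wc -c < "$workdir/numbers.txt")
size_after=$(wc -c < "$workdir/numbers.txt.gz")
echo "uncompressed: $size_before bytes, compressed: $size_after bytes"

# 2) Gather a folder into ONE archive file, then the same with compression (-z)
mkdir "$workdir/demo_folder"
seq 1 1000 > "$workdir/demo_folder/a.txt"
seq 1 1000 > "$workdir/demo_folder/b.txt"
# -C changes into the parent directory first, so the archive stores relative paths
tar -cf "$workdir/archive.tar" -C "$workdir" demo_folder
tar -czf "$workdir/archive.tar.gz" -C "$workdir" demo_folder

# List the archive content without extracting it
tar -tf "$workdir/archive.tar"
```

How much is gained depends on the data: text usually shrinks a lot, while formats that are already compressed (for example many image and video formats) shrink very little.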
Extract/inflate
gunzip compressed_file.gz
tar -xf archive.tar
tar -xzf compressed_archive.tar.gz
- The extracted folders will inherit the old name and internal structure.
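To illustrate that an extracted folder keeps its name and internal structure, here is a small sketch. The folder project and the directory destination are hypothetical stand-ins, and everything happens in a temporary directory.

```shell
workdir=$(mktemp -d)

# Build a small nested folder and pack it (hypothetical names)
mkdir -p "$workdir/project/data/raw"
echo "hello" > "$workdir/project/data/raw/sample.txt"
tar -czf "$workdir/project.tar.gz" -C "$workdir" project

# Extract into a separate, empty directory standing in for the destination
mkdir "$workdir/destination"
tar -xzf "$workdir/project.tar.gz" -C "$workdir/destination"

# The folder name and its internal structure are back
find "$workdir/destination" -type f

# gunzip restores a single gzip-compressed file and removes the .gz file
echo "some text" > "$workdir/notes.txt"
gzip "$workdir/notes.txt"       # leaves only notes.txt.gz
gunzip "$workdir/notes.txt.gz"  # leaves only notes.txt again
```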
Can I use archiving and compressing in all transfer methods?
- Yes!
Workflow
- Archive and compress a folder with many large files
  tar -czf manylargefiles_folder.tar.gz manylargefiles_folder/
- Transfer data
  - Use FileZilla/scp/rsync/sftp
- Extract at target destination
  tar -xzf manylargefiles_folder.tar.gz
- You should now have manylargefiles_folder/ again at the target destination!
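The workflow can be rehearsed without any remote system. In this sketch two temporary local directories stand in for the two machines, and cp stands in for the real transfer tool; both are assumptions made only so the example runs anywhere.

```shell
# Two local directories stand in for the two machines (an assumption of this sketch)
source_side=$(mktemp -d)
target_side=$(mktemp -d)

# A folder with several files, standing in for manylargefiles_folder
mkdir "$source_side/manylargefiles_folder"
for i in 1 2 3 4 5; do
    seq 1 20000 > "$source_side/manylargefiles_folder/file_$i.txt"
done

# 1) Archive and compress
(cd "$source_side" && tar -czf manylargefiles_folder.tar.gz manylargefiles_folder/)

# 2) Transfer: cp is only a local stand-in for FileZilla/scp/rsync/sftp
cp "$source_side/manylargefiles_folder.tar.gz" "$target_side/"

# 3) Extract at the target destination
(cd "$target_side" && tar -xzf manylargefiles_folder.tar.gz)

# Compare both folders; diff prints nothing and succeeds when they are identical
diff -r "$source_side/manylargefiles_folder" "$target_side/manylargefiles_folder" \
    && echo "folders are identical"
```

In a real transfer only step 2 changes: the cp line is replaced by the transfer tool of your choice.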
Cheat sheets
Options for compressing during the transfer
- scp -C ...
- rsync --compress ... or rsync -z ...
- sftp -C user@host
The file(s) are then decompressed at the destination.
Exercises
(Optional) Exercise 1: Download a directory with many files
Tips
- Be in the transfer directory (or similar) and create 3000 (empty) files REMOTELY in a directory with the name many_files
  $ mkdir many_files
  $ cd many_files
  $ touch my-file-{1..3000}.txt
- Time the download of the directory, using time and the recursive option to include the files within the directory
  time scp ....
Answer (Tetralith example)
time scp -r sm_bcarl@tetralith.nsc.liu.se:~/test/many_files .
(Optional) Exercise 2: Test the difference between transferring one or several files (using scp)
Tips
- Archive the many_files directory
  - The original directory is still there! Check!
- Time the download of the original directory, using time scp ....
  - If time does not work, count the seconds!
- Time the download of the archived directory, using time scp ....
  - If time does not work, count the seconds!
- Focus on the user line (the CPU time used by scp itself), because real also includes the time for establishing the connection and entering the credentials!
- Do you spot any difference?
Answer (Tetralith example)
Archiving step on REMOTE
tar -cvf many_files.tar many_files
- The original directory is still there! Check!
LOCALLY
time scp -r sm_bcarl@tetralith.nsc.liu.se:~/transfer/many_files .
- Note the -r for recursive, which includes the files in the folder.
time scp sm_bcarl@tetralith.nsc.liu.se:~/transfer/many_files.tar .
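Before timing the downloads, it can help to confirm that the archive really holds every file and that the original directory is untouched. A local sketch, assuming a temporary directory in place of the remote transfer folder:

```shell
workdir=$(mktemp -d)
mkdir "$workdir/many_files"

# Same result as: touch my-file-{1..3000}.txt
# (written with seq and xargs so it also works in shells without brace expansion)
(cd "$workdir/many_files" && seq -f 'my-file-%g.txt' 1 3000 | xargs touch)

# Archive without compression, as in the exercise
tar -cf "$workdir/many_files.tar" -C "$workdir" many_files

# The archive is ONE file that lists all entries ...
files_in_archive=$(tar -tf "$workdir/many_files.tar" | grep -c '\.txt$')
# ... and the original directory is still there with all its files
files_in_dir=$(ls "$workdir/many_files" | wc -l)
echo "archive: $files_in_archive files, directory: $files_in_dir files"
```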
(Optional) Exercise 3: Test the difference between transferring one or several files (using SFTP)
Tips
In an SSH session (not SFTP) with REMOTE/server
- To not interfere with the last exercise, make a new folder by creating 3000 files REMOTELY in a directory with the name many_files_sftp
  $ mkdir many_files_sftp
  $ cd many_files_sftp
  $ touch file_{1..3000}.txt
  - Check content: $ ls
  - Leave the directory to be able to perform the next step: $ cd ..
- Also archive the many_files_sftp folder to many_files_sftp.tar
  - The original directory is still there! Check!
Establish the SFTP session (Exercise 1 in SFTP session)
- Download (to local) the directory and note the time needed (not shown in numbers, so count the seconds!)
- Download (to local) the .tar file and note the time needed
- Was there a significant difference?
Answer (Example with Tetralith)
Archiving step REMOTELY
tar -cvf many_files_sftp.tar many_files_sftp
Establish SFTP connection
$ sftp sm_bcarl@tetralith.nsc.liu.se
Download
> get -r many_files_sftp
- We need the recursive option -r.
> get many_files_sftp.tar