Last active
September 30, 2022 21:14
Script to transfer data from Juelich Supercomputing Center to HZDR.
#!/bin/bash
#
# Script to transfer data from Juelich Supercomputing Center to HZDR.
# Call with
# > screen                      # open a screen session first to be able to log out from the
#                               # data mover system
# > exec ssh-agent bash         # start a shell under ssh-agent so that the key passphrase
#                               # does not have to be typed for every transfer
# > ssh-add ~/.ssh/id_ed25519   # add the ssh key to ssh-agent (asks for the passphrase once)
# > xargs -a dirs.list -n 1 -P 5 ~/bin/data-transfer_judac.sh | tee transfer.out
# which will transfer 5 directories named in file dirs.list at a time.
# Prepare dirs.list, the list of directories that need to be transferred, from the output of
# the following command, run on judac in the directory that contains the directories to transfer
# > find . -maxdepth 1 -name "your_directory_naming_pattern" -print
#
# In ~/.ssh/config I have a definition for Host judac providing the HostName judac.fz-juelich.de,
# User, IdentityFile, and setting ForwardAgent No, AddKeysToAgent No.
#
# Note that rsync always verifies that each transferred file was correctly reconstructed on the
# receiving side by checking a whole-file checksum that is generated as the file is transferred.
#
# Nevertheless, successful file transfer can be checked manually by creating md5 checksums of all
# files in the source and comparing them to the respective checksums on the target system.
# In the source directory, create the file checksums.md5 by
# > find . -type f ! -name checksums.md5 -exec md5sum {} + > checksums.md5
# (checksums.md5 itself is excluded because it is still being written while find runs)
# and compare with the transferred directory on the target system by
# > md5sum --check checksums.md5 2>check.err >check.out
# where checksums.md5 can be transferred via rsync.
# Then check that everything is as expected by grepping the check.out file for failed files
# > grep -i 'failed' check.out
#
# It is better to parallelize creation and checking of the md5sums directory-wise with xargs,
# in the same way as the data transfer, i.e. rsync below.
#
# CC0 Klaus Steiniger, 2021-2022
# Define source directory on JSC file system
SOURCE="/p/scratch/project/parent/directory/of/folders/to/copy"
# Define target directory on HZDR file system
TARGET="/net/gssnfs/bigdata/project/parent/directory/where/folders/are/saved"

# Name of the directory to transfer, without leading path components such as "./"
NAME="${1##*/}"

printf "Transfer of %s started at %s\n" "$NAME" "$(date +%F_%H%M%S)"
# -a archive mode, -v verbose, -z compress, -h human-readable numbers, -P keep partial files and show progress
rsync --stats -avzhPe 'ssh' "judac:$SOURCE/${1}" "$TARGET/" 2>"$TARGET/transfer_$NAME.err" >"$TARGET/transfer_$NAME.out"
printf "Transfer of %s finished at %s\n" "$NAME" "$(date +%F_%H%M%S)"
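
The header comment of the script recommends parallelizing the creation and checking of md5 checksums directory-wise with xargs, in the same way as the transfer. The following is a minimal sketch of that idea, not part of the original script. It assumes bash, GNU xargs and GNU md5sum. The function names create_checksums, verify_checksums and run_per_directory as well as the file names checksums_<dir>.md5, check_<dir>.out and check_<dir>.err are hypothetical and chosen for illustration only. As with the transfer itself, directory names containing whitespace are not handled.

```shell
#!/bin/bash
# Sketch only: all function and file names here are illustrative assumptions.

# Write checksums_<dir>.md5 for one directory. Paths are recorded relative to
# the current (parent) directory, and the checksum file is stored outside the
# directory being scanned.
create_checksums() {
    local dir="${1#./}"    # strip the leading "./" that find prints
    find "./${dir}" -type f -exec md5sum {} + > "checksums_${dir}.md5"
}

# Check one directory against checksums_<dir>.md5. The checksum file must lie
# in the current directory, next to the transferred directory.
verify_checksums() {
    local dir="${1#./}"
    md5sum --check "checksums_${dir}.md5" > "check_${dir}.out" 2> "check_${dir}.err"
}

# Run one of the functions above for every entry of a list file,
# several directories at a time (default 5, as for the transfer).
run_per_directory() {
    local func="$1" list="$2" jobs="${3:-5}"
    # Make the functions visible to the bash processes started by xargs
    export -f create_checksums verify_checksums
    xargs -a "$list" -n 1 -P "$jobs" bash -c "$func"' "$1"' _
}

# Usage, on the source system in the parent directory of the folders:
#   run_per_directory create_checksums dirs.list
# Usage, on the target system after also transferring the checksums_*.md5 files:
#   run_per_directory verify_checksums dirs.list
#   grep -i 'failed' check_*.out
```

Keeping one checksum file per directory means a failed or interrupted check only has to be repeated for the affected directory. xargs returns a non-zero exit status if any of the per-directory checks fails.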