@mchelen
Created June 15, 2010 16:54
#!/bin/sh
# author: Michael Chelen http://mikechelen.com http://twitter.com/mikechelen
# license: Creative Commons Zero Public Domain http://creativecommons.org/publicdomain/zero/1.0/
# requires curlftpfs
# mirrors the XML files from the PubMed Central Open Access Subset FTP
# tested with Ubuntu Server 10.04
# ftp server and remote path
ftp=ftp.ncbi.nlm.nih.gov
ftppath=pub/pmc
# output path
outputpath=output
# create mount point
mkdir -p "$ftp"
# mount ftp with curlftpfs; stop here if the mount fails
echo "Mounting $ftp"
curlftpfs "$ftp" "$ftp" || exit 1
# create output path
mkdir -p "$outputpath"
# extract files
for I in A-B C-H I-N O-Z
do
    currentfile="$ftp/$ftppath/articles.$I.tar.gz"
    echo "Extracting $currentfile to $outputpath"
    # -k keeps files that already exist in the output path
    tar --totals -xvzkf "$currentfile" -C "$outputpath"
    echo "Done with $currentfile"
done
# lazily unmount and remove the mount point directory
fusermount -uz "$ftp"
rmdir "$ftp"
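# Optional sanity check: a minimal sketch, not part of the original script.
# It reports how many article files ended up under the output path.
# Assumptions: count_articles is a hypothetical helper name, and the archives
# are assumed to store articles as .nxml files; adjust the pattern if they do not.
count_articles() {
    # $1: directory to search; prints the number of matching regular files
    find "$1" -type f -name '*.nxml' 2>/dev/null | wc -l | tr -d ' '
}
# fall back to "output" if outputpath is not set
echo "Found $(count_articles "${outputpath:-output}") .nxml files in ${outputpath:-output}"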