@nickloman
Created October 29, 2012 15:14
Retrieve all Escherichia sequences from GenBank with rsync
rsync -av rsync://ftp.ncbi.nlm.nih.gov/genomes/Bacteria --include "*/" --include "Bacteria/Escherichia*/*.fna" --exclude="*" .
# Bonus script: concatenate each genome's chromosomes and plasmids into a single FASTA file.
# Output is appended with >>, so make sure the .fasta files don't already exist.
find . -mindepth 2 -type f -name "*.fna" | while IFS= read -r i ; do cat "$i" >> "$(dirname "$i").fasta" ; done
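
Because the one-liner appends with `>>`, running it twice doubles every output file. A minimal sketch of a re-runnable variant follows. It assumes each genome lives in its own directory that directly contains its `.fna` files; the function name `concat_genomes` is hypothetical and not part of the original gist.

```shell
# Sketch only: concat_genomes is a hypothetical helper, not from the original gist.
# Usage: concat_genomes [ROOT]   (ROOT defaults to the current directory)
concat_genomes() {
    local root="${1:-.}"
    local dir out
    # List each directory that directly contains .fna files, once per directory.
    find "$root" -mindepth 2 -type f -name '*.fna' -exec dirname {} \; | sort -u |
    while IFS= read -r dir; do
        out="$dir.fasta"
        # Skip instead of appending when the output is already there.
        if [ -e "$out" ]; then
            echo "skipping $dir: $out already exists" >&2
            continue
        fi
        # The glob expands in sorted order, so the record order is reproducible.
        cat "$dir"/*.fna > "$out"
    done
}
```

Calling `concat_genomes .` from the download directory writes one `<genome directory>.fasta` file next to each genome directory and leaves existing outputs untouched.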