@prehensilecode
Last active March 14, 2022 17:44
Updates and unpacks NCBI BLAST dbs
#!/bin/bash
module use /ifs/opt/modulefiles
module load python/gcc/3.9.1
module load ncbi-blast
### About the db sources:
### - ncbi provides .tar.gz files, which need to be extracted
### - gcp and aws provide all the db files directly
logfile=/var/log/blast.log
date -u -Iseconds >> $logfile
export dbsource="ncbi"
if [[ $dbsource = "ncbi" ]]
then
    export nthreads=1
else
    export nthreads=4
fi
echo "Updating BLAST dbs from ${dbsource}; no. of threads = ${nthreads}" >> $logfile 2>&1
curl -s https://ftp.ncbi.nlm.nih.gov/blast/db/v5/README --output README
curl -s https://ftp.ncbi.nlm.nih.gov/blast/db/v5/blastdbv5.pdf --output blastdbv5.pdf
curl -s https://ftp.ncbi.nlm.nih.gov/blast/db/v5/blastdb-manifest.json --output blastdb-manifest.json
for db in $( update_blastdb.pl --showall --quiet )
do
    echo "Updating $db ..." >> $logfile 2>&1
    update_blastdb.pl --num_threads $nthreads --source $dbsource "$db" >> $logfile 2>&1
    retval=$?
    # Retry every 120 seconds until the download succeeds
    while [ $retval -ne 0 ]
    do
        sleep 120
        update_blastdb.pl --num_threads $nthreads --source $dbsource "$db" >> $logfile 2>&1
        retval=$?
    done
done
if [[ $dbsource = "ncbi" ]]
then
    for tarball in *.tar.gz
    do
        # Skip the unexpanded pattern when no tarballs are present
        [ -e "$tarball" ] || continue
        echo "Extracting ${tarball} ..." >> $logfile 2>&1
        tar --no-same-owner -xvf "$tarball" >> $logfile 2>&1
    done
fi
chgrp urcfadm *
echo "$(date -u -Iseconds) - Done" >> $logfile
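
The retry loop in the script repeats a failed download every 120 seconds with no upper limit, so a database that can never be fetched would stall the run indefinitely. A bounded variant might look like the sketch below. The `retry` helper, its argument order, and the attempt and delay values are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have been made.
# Returns 0 on success, otherwise the exit status of the last failed attempt.
retry() {
    local max_attempts=$1
    local delay=$2
    shift 2
    local attempt=1
    local status=0
    while true
    do
        # Run the command; stop as soon as it succeeds
        "$@" && return 0
        status=$?
        # Give up once the attempt budget is used
        if [ "$attempt" -ge "$max_attempts" ]
        then
            return "$status"
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

Under these assumptions, the download step inside the loop could be written as `retry 5 120 update_blastdb.pl --num_threads $nthreads --source $dbsource "$db" >> $logfile 2>&1`, followed by a check of `$?` to log any database that still failed after five attempts.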