@dbolser-ebi
Last active September 21, 2018 17:39
Generate Compara dumps for Plant Reactome (everything you need, I think)
## Get BioMart
git clone https://github.com/biomart/biomart-perl.git
## Get ready
cd biomart-perl/scripts
## Get Sharon's script from here:
## https://gist.github.com/dbolser-ebi/1890e35cb6d1366544e1346666cfe6f5
wget https://gist.githubusercontent.com/dbolser-ebi/1890e35cb6d1366544e1346666cfe6f5/raw/2b255a65db5b7ea3229cf938d0c6d7980a9b192f/biomart_perl_query_4Justin.pl
## Set up your Ensembl Perl environment
source /homes/dbolser/EG_Places/Devel/lib/libensembl-94/setup.sh
## I think I had to add XML parsing from cpan (using this handy guide:
## http://bioblog5000.blogspot.com/2011/12/this-way-individual-projects-with.html)
source local-lib.sh
## Finally, add BioMart
export PERL5LIB=$PERL5LIB:$PWD/../lib
## Now edit ../conf/martURLLocation.xml!
## Note: MartService will generate the registry XML for you, e.g. from
## http://ens-prod-1.ebi.ac.uk:10301/biomart/martservice?type=registry
## And we're more or less ready...
ensembl_version=$(perl -MBio::EnsEMBL::ApiVersion -e "print software_version")
eg_version=$(echo $ensembl_version-53 | bc)
echo $eg_version/$ensembl_version
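## The offset of 53 between the Ensembl and Ensembl Genomes release numbers
## can be wrapped in a small helper that rejects non-numeric input (for
## example, an empty string if the Perl API failed to load). A minimal
## sketch; the name eg_version_from_ensembl is made up for illustration.

```shell
## Hypothetical helper: derive the Ensembl Genomes release from the
## Ensembl release, using the same offset (53) as the bc call above.
eg_version_from_ensembl() {
    case "$1" in
        ''|*[!0-9]*)
            echo "not a release number: '$1'" >&2
            return 1
            ;;
    esac
    echo $(( $1 - 53 ))
}

eg_version_from_ensembl 94   # Ensembl 94 corresponds to EG 41
```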
## Note: run in 'clean' mode the first time (e.g. for testing), then in
## 'cached' mode for later runs
perl biomart_perl_query_4Justin.pl dcarota
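db_list=~/Plants/Lists/plant_list-$eg_version.txt
## Make sure the output directory exists before redirecting into it
mkdir -p Output
while read -r db; do
    echo "$db"
    ## Abbreviate the core database name: first letter of the genus
    ## plus the species name
    db=$(echo "$db" | perl -ne '
        die unless /^(\w)\w+_(\w+)_core_/; print "$1$2\n"
    ')
    time \
        perl ./biomart_perl_query_4Justin.pl \
        "$db" > "Output/$db-$eg_version.out"
    echo
done < \
    <(grep _core_ "$db_list")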
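## The Perl one-liner in the loop turns a core database name into the short
## species code passed to the query script (first letter of the genus plus
## the species name). The same mapping as a standalone shell function, so it
## can be tried on a single name. A sketch only: abbreviate_db is a made-up
## name, and the example database name is an assumption about the naming
## scheme used in the plant list.

```shell
## Hypothetical helper mirroring the regex /^(\w)\w+_(\w+)_core_/
abbreviate_db() {
    short=$(printf '%s\n' "$1" |
        sed -nE 's/^([[:alnum:]_])[[:alnum:]_]+_([[:alnum:]_]+)_core_.*$/\1\2/p')
    ## sed prints nothing when the name does not look like a core database
    [ -n "$short" ] || { echo "not a core database name: $1" >&2; return 1; }
    printf '%s\n' "$short"
}

abbreviate_db daucus_carota_core_41_94_1   # prints dcarota
```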
ls Output/*-$eg_version.out | wc -l
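## Because the shell creates each .out file before the query runs, the count
## above also includes files left behind by queries that failed. A sketch for
## listing empty outputs before building the tarball; list_empty_outputs is a
## made-up name.

```shell
## Hypothetical helper: print every empty <dir>/*-<version>.out file
list_empty_outputs() {
    dir=$1
    version=$2
    for f in "$dir"/*-"$version".out; do
        ## Skip the unexpanded glob when nothing matches
        [ -e "$f" ] || continue
        ## -s is true for files with a size greater than zero
        [ -s "$f" ] || echo "$f"
    done
}

## e.g. list_empty_outputs Output "$eg_version"
```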
tar -cjvf \
compara_orthologues-$eg_version.tar.bz \
Output/*-$eg_version.out
cp compara_orthologues-$eg_version.tar.bz \
/nfs/panda/ensemblgenomes/ftp/pub/misc_data/gramene/
mail -s test preecej@science.oregonstate.edu < \
<( echo ftp://ftp.ensemblgenomes.org/pub/misc_data/gramene/compara_orthologues-$eg_version.tar.bz )
@weix-cshl

Nice!
