@stevekm
Forked from brentp/gist:819611
Created February 3, 2017 01:30
install annovar and use it to annotate a vcf with hg19
# save the download under the name that tar expects below
wget -O annovar.tar.gz http://www.openbioinformatics.org/annovar/download/annovar.latest.tar.gz.mirror
tar xzvf annovar.tar.gz
cd annovar
# download databases (goes to UCSC)
./annotate_variation.pl -buildver hg19 -downdb 1000g2010nov humandb
./annotate_variation.pl -buildver hg19 -downdb avsift humandb
./annotate_variation.pl -buildver hg19 -downdb refGene humandb
./annotate_variation.pl -buildver hg19 -downdb mce46way humandb/
./annotate_variation.pl -buildver hg19 -downdb snp131 humandb/
./annotate_variation.pl -buildver hg19 -downdb segdup humandb/
# more extensive gene db. see:
# http://www.openbioinformatics.org/annovar/annovar_gene.html
# (Switching to UCSC Known Gene annotation or Ensembl Gene annotation)
./annotate_variation.pl --buildver hg19 --downdb knowngene humandb
./annotate_variation.pl --buildver hg19 --downdb ensgene humandb
# known variation
./annotate_variation.pl --buildver hg19 --downdb dgv humandb/
./annotate_variation.pl --buildver hg19 --downdb gwascatalog humandb/
# not available yet:
# perl ./annotate_variation.pl -buildver hg19 -downdb tfbs humandb/
# convert the VCF to ANNOVAR input format. IN must hold the path to the input VCF.
./convert2annovar.pl -format vcf4 "$IN" > "${IN}.annovar"
# see: http://www.openbioinformatics.org/annovar/annovar_faq.html#1000g_hg19
./annotate_variation.pl --filter --buildver hg19 --dbtype 1000g2010nov_all ${IN}.annovar humandb/
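# filtering is iterative: each --filter run appends the database name
# (here hg19_ALL.sites.2010_11) to the output file name. the _filtered file holds
# the variants that were not found in the 1000 Genomes sites; the _dropped file holds those that were.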
./annotate_variation.pl --filter --buildver hg19 --dbtype snp131 ${IN}.annovar.hg19_ALL.sites.2010_11_filtered humandb/
# now ${IN}.annovar.hg19_ALL.sites.2010_11_filtered.hg19_snp131_filtered
# contains the variants found in neither dbSNP 131 nor 1000 Genomes.
# http://www.openbioinformatics.org/annovar/annovar_faq.html#vcf
# maybe can use this: http://biostar.stackexchange.com/questions/3432/1000g-and-dbsnp-build-132-in-ucsc-genome-browser/3436#3436
# for dbsnp 132
# TODO: http://www.openbioinformatics.org/annovar/annovar_faq.html#iupac
# IUPAC calls are excluded as bad input. convert these first
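# A possible first step toward the TODO above: a minimal sketch that only reports
# (it does not convert) VCF records whose REF or ALT carries an IUPAC ambiguity code.
# find_iupac_records is a hypothetical helper, not part of ANNOVAR. It assumes a
# tab-separated VCF with REF in column 4 and ALT in column 5.

```shell
# Hypothetical helper: print VCF records with an IUPAC ambiguity code in REF or ALT.
# Reads the files given as arguments, or stdin when none are given.
find_iupac_records() {
    awk -F'\t' '
        /^#/ { next }                                   # skip header lines
        {
            ref = toupper($4)
            alt = toupper($5)
            # only plain base strings are checked, so symbolic alleles such as <DEL> are ignored
            bad_ref = (ref ~ /^[A-Z]+$/  && ref ~ /[RYSWKMBDHV]/)
            bad_alt = (alt ~ /^[A-Z,]+$/ && alt ~ /[RYSWKMBDHV]/)
            if (bad_ref || bad_alt) print
        }
    ' "$@"
}

# usage (assumes IN is set as above):
#   find_iupac_records "$IN" | wc -l
```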
FILTERED=${IN}.annovar.hg19_ALL.sites.2010_11_filtered.hg19_snp131_filtered
# annotate remaining snps by proximity to gene.
./annotate_variation.pl --buildver hg19 --geneanno $FILTERED humandb/
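# now ${FILTERED}.exonic_variant_function lists the exonic variants and their coding consequences,
# and ${FILTERED}.variant_function annotates every variant by region (exonic, intronic, UTR,
# upstream/downstream, intergenic) together with the nearest genes and their distances.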
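# To get a quick overview of the gene annotation, the functional classes can be tallied.
# This is a sketch under the assumption that the exonic_variant_function file is
# tab-separated with the functional class (e.g. "nonsynonymous SNV") in column 2.
# count_exonic_classes is a hypothetical helper, not part of ANNOVAR.

```shell
# Hypothetical helper: count variants per functional class, most frequent first.
# Reads the files given as arguments, or stdin when none are given.
count_exonic_classes() {
    cut -f2 "$@" | sort | uniq -c | sort -rn
}

# usage (assumes FILTERED is set as above):
#   count_exonic_classes "${FILTERED}.exonic_variant_function"
```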
# generate a big spreadsheet of the annotations.
./auto_annovar.pl --buildver hg19 --ver1000g 1000g2010nov --verdbsnp 131 --genetype knowngene --outfile lung_auto LTRC_274462_lung_unique.annovar humandb