@bschilder
Last active November 23, 2023 03:08
install_vep

The Variant Effect Predictor (VEP) is a super useful tool for extracting many kinds of variant-level annotations from Ensembl. Unfortunately, it is extremely difficult to install. Here are my attempts to install it via every method I could find (none of them worked...).

See here for more help on the VEP GitHub repo.

Method 1: no environment

1. Install system deps

On Mac:

brew install cpanminus htslib xz

Or on Linux:

sudo apt-get -y install cpanminus libhts-dev liblzma-dev xz-utils

2. Install Perl deps

cpanm --sudo DBI
cpanm --sudo DBD::mysql
cpanm --sudo Set::IntervalTree
cpanm --sudo JSON
cpanm --sudo PerlIO::gzip
cpanm --sudo Bio::DB::BigFile 
cpanm --sudo Bio::DB::HTS
cpanm --sudo Try::Tiny
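
Before running the installer, it can help to confirm which of these modules Perl can actually load. This is a minimal sketch: the function name check_perl_modules is made up for illustration (it is not part of VEP), and it relies only on `perl -MModule -e 1`, which exits non-zero when a module cannot be loaded.

```shell
# Hypothetical helper (not part of VEP): report which Perl modules load.
check_perl_modules() {
  missing=0
  for mod in "$@"; do
    if perl -M"$mod" -e 1 >/dev/null 2>&1; then
      echo "ok       $mod"
    else
      echo "MISSING  $mod"
      missing=$((missing + 1))
    fi
  done
  # Return non-zero if any module failed to load.
  [ "$missing" -eq 0 ]
}

# Usage: the modules from the cpanm commands above.
# check_perl_modules DBI DBD::mysql Set::IntervalTree JSON PerlIO::gzip \
#   Bio::DB::BigFile Bio::DB::HTS Try::Tiny
```

Any module reported as MISSING here is likely to show up later as an installer warning or a failed test.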

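3. Set up htslib path (macOS only)

export DYLD_LIBRARY_PATH="$DYLD_LIBRARY_PATH:/Users/$USER/Desktop/ensembl-vep/htslib"

4. Clone the repo and run the installer

This assumes the repo is cloned into ~/Desktop, to match the path above.

git clone https://github.com/Ensembl/ensembl-vep.git
cd ensembl-vep
perl INSTALL.pl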

Alternatively, you can get the source code via the Download (zip) option, but it runs into the exact same errors as the GitHub distribution.

Error using default method:

bash-3.2$ perl INSTALL.pl 

WARNING: DBD::mysql module not found. VEP can only run in offline (--offline) mode without DBD::mysql installed

http://www.ensembl.org/info/docs/tools/vep/script/vep_download.html#requirements
Installation on OSX requires that you set up some paths before running this installer.
Have you 
1. added /Users/bms20/Desktop/ensembl-vep/htslib to your DYLD_LIBRARY_PATH environment variable?
(y/n): y
Hello! This installer will help you set up VEP v110, including:
 - Install v110 of the Ensembl API for use by the VEP. It will not affect any existing installations of the Ensembl API that you may have.
 - Download and install cache files from Ensembl's FTP server.
 - Download FASTA files from Ensembl's FTP server.
 - Download VEP plugins.

Checking for installed versions of the Ensembl API...done

Setting up directories
Destination directory ./Bio already exists.
Do you want to overwrite it (if updating VEP this is probably OK) (y/n)? y
 - fetching BioPerl
 - unpacking ./Bio/tmp/release-1-6-924.zip
 - moving files
Attempting to install Bio::DB::HTS and htslib.

>>> If this fails, try re-running with --NO_HTSLIB

 - checking out HTSLib
fatal: destination path 'htslib' already exists and is not an empty directory.
 - building HTSLIB in ./htslib
In /Users/bms20/Desktop/ensembl-vep/htslib
gcc -g -Wall -O2 -fPIC -Wno-unused -Wno-unused-result -I.  -c -o cram/cram_io.o cram/cram_io.c
cram/cram_io.c:61:10: fatal error: lzma.h: No such file or directory
 #include <lzma.h>
          ^~~~~~~~
compilation terminated.
make: *** [cram/cram_io.o] Error 1
Compile didn't complete. No libhts.a library file found at INSTALL.pl line 925.
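
The compile error above means the compiler could not find lzma.h, which ships with the xz / liblzma development files. Here is a minimal sketch of checking for the header and pointing gcc at it. find_header is a hypothetical helper, the candidate directories are assumptions about common install locations, and CPATH / LIBRARY_PATH are the standard GCC environment variables for extra include and library search paths. This is untested against the build above.

```shell
# Hypothetical helper: print the first directory that contains the header.
find_header() {
  header="$1"; shift
  for dir in "$@"; do
    if [ -f "$dir/$header" ]; then
      echo "$dir"
      return 0
    fi
  done
  return 1
}

# Usage (candidate directories are assumptions; adjust for your system):
# inc="$(find_header lzma.h /usr/include /usr/local/include /opt/homebrew/include)" \
#   && export CPATH="$inc${CPATH:+:$CPATH}" \
#   && export LIBRARY_PATH="${inc%/include}/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"
```

If find_header prints nothing, the header is probably not installed at all, and the xz development package needs installing first.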

Error using --NO_HTSLIB flag:

bash-3.2$ perl INSTALL.pl --NO_HTSLIB
WARNING: DBD::mysql module not found. VEP can only run in offline (--offline) mode without DBD::mysql installed

http://www.ensembl.org/info/docs/tools/vep/script/vep_download.html#requirements
Hello! This installer will help you set up VEP v110, including:
 - Install v110 of the Ensembl API for use by the VEP. It will not affect any existing installations of the Ensembl API that you may have.
 - Download and install cache files from Ensembl's FTP server.
 - Download FASTA files from Ensembl's FTP server.
 - Download VEP plugins.

Checking for installed versions of the Ensembl API...done

Setting up directories
Destination directory ./Bio already exists.
Do you want to overwrite it (if updating VEP this is probably OK) (y/n)? y
 - fetching BioPerl
 - unpacking ./Bio/tmp/release-1-6-924.zip
 - moving files

Downloading required Ensembl API files
 - fetching ensembl
 - unpacking ./Bio/tmp/ensembl.zip
 - moving files
 - getting version information
 - fetching ensembl-variation
 - unpacking ./Bio/tmp/ensembl-variation.zip
 - moving files
 - getting version information
 - fetching ensembl-funcgen
 - unpacking ./Bio/tmp/ensembl-funcgen.zip
 - moving files
 - getting version information
 - fetching ensembl-io
 - unpacking ./Bio/tmp/ensembl-io.zip
 - moving files
 - getting version information

Testing VEP installation
./t/Utils.t .......................................... ok    
./t/Parser_Region.t .................................. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Parser_Region.t .................................. ok    
./t/FilterSet.t ...................................... ok     
./t/AnnotationSource_Cache.t ......................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource_Cache.t ......................... ok   
./t/AnnotationSource_Database_Variation.t ............ Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource_Database_Variation.t ............ ok    
./t/Haplo_Parser_VCF.t ............................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Haplo_Parser_VCF.t ............................... ok    
./t/Haplo_InputBuffer.t .............................. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Haplo_InputBuffer.t .............................. 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/Haplo_InputBuffer.t .............................. ok    
./t/Haplo_AnnotationSource_File_GFF.t ................ Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Haplo_AnnotationSource_File_GFF.t ................ 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/Haplo_AnnotationSource_File_GFF.t ................ ok    
./t/AnnotationSource_Database_RegFeat.t .............. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource_Database_RegFeat.t .............. ok    
./t/OutputFactory_Tab.t .............................. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/OutputFactory_Tab.t .............................. 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/OutputFactory_Tab.t .............................. ok    
./t/version.t ........................................ ok   
./t/AnnotationSource_File_BigWig.t ................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/AnnotationSource_File_BigWig.t ................... ok    
./t/AnnotationSource_Cache_VariationTabix.t .......... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource_Cache_VariationTabix.t .......... ok    
./t/OutputFactory_VEP_output.t ....................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/OutputFactory_VEP_output.t ....................... 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/OutputFactory_VEP_output.t ....................... ok    
./t/Config.t ......................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Config.t ......................................... ok    
./t/TranscriptTree.t ................................. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/TranscriptTree.t ................................. 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/TranscriptTree.t ................................. ok    
./t/InputBuffer.t .................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/InputBuffer.t .................................... ok    
./t/OutputFactory_JSON.t ............................. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/OutputFactory_JSON.t ............................. ok    
./t/Parser.t ......................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Parser.t ......................................... ok     
./t/VariantRecoder.t ................................. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
"my" variable $sth masks earlier declaration in same scope at Bio/EnsEMBL/DBSQL/TranslationAdaptor.pm line 607.
./t/VariantRecoder.t ................................. ok    
./t/AnnotationSource_File.t .......................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/AnnotationSource_File.t .......................... ok    
./t/Parser_ID.t ...................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Parser_ID.t ...................................... ok   
./t/OutputFactory.t .................................. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/OutputFactory.t .................................. 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/OutputFactory.t .................................. ok     
./t/Haplo_Runner.t ................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Haplo_Runner.t ................................... 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/Haplo_Runner.t ................................... ok    
./t/Haplo_AnnotationSource_File_GTF.t ................ Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Haplo_AnnotationSource_File_GTF.t ................ 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/Haplo_AnnotationSource_File_GTF.t ................ ok    
./t/Haplo_AnnotationSource_Cache_Transcript.t ........ Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Haplo_AnnotationSource_Cache_Transcript.t ........ 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/Haplo_AnnotationSource_Cache_Transcript.t ........ ok    
./t/Parser_VEP_input.t ............................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Parser_VEP_input.t ............................... ok    
./t/Parser_SPDI.t .................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Parser_SPDI.t .................................... ok   
./t/CacheDir.t ....................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/CacheDir.t ....................................... ok    
./t/AnnotationSource_Database_Transcript.t ........... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource_Database_Transcript.t ........... ok    
./t/Runner.t ......................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/Runner.t ......................................... 58/? 
-------------------- EXCEPTION --------------------
MSG: ERROR: Cannot use format gff without Bio::DB::HTS::Tabix module installed

STACK Bio::EnsEMBL::VEP::AnnotationSource::File::new /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm:177
STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all_custom /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSourceAdaptor.pm:272
STACK Bio::EnsEMBL::VEP::AnnotationSourceAdaptor::get_all /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSourceAdaptor.pm:94
STACK Bio::EnsEMBL::VEP::BaseRunner::get_all_AnnotationSources /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/BaseRunner.pm:170
STACK Bio::EnsEMBL::VEP::Runner::init /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:128
STACK Bio::EnsEMBL::VEP::Runner::next_output_line /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:362
STACK toplevel ./t/Runner.t:831
Date (localtime)    = Wed Nov 22 22:07:48 2023
Ensembl API version = 110
---------------------------------------------------
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 255 just after 81.
./t/Runner.t ......................................... Dubious, test returned 255 (wstat 65280, 0xff00)
All 81 subtests passed 
./t/AnnotationSource_Database_StructuralVariation.t .. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource_Database_StructuralVariation.t .. ok    
./t/AnnotationSource_Cache_RegFeat.t ................. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource_Cache_RegFeat.t ................. ok    
./t/AnnotationSource_File_GFF.t ...................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/AnnotationSource_File_GFF.t ...................... ok    
./t/Parser_HGVS.t .................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Parser_HGVS.t .................................... ok    
./t/AnnotationSource_File_VCF.t ...................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/AnnotationSource_File_VCF.t ...................... ok    
./t/bam_edit.t ....................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/bam_edit.t ....................................... ok    
./t/OutputFactory_VCF.t .............................. Possible attempt to separate words with commas at ./t/OutputFactory_VCF.t line 598.
Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/OutputFactory_VCF.t .............................. 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/OutputFactory_VCF.t .............................. ok    
./t/AnnotationSourceAdaptor.t ........................ Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/AnnotationSourceAdaptor.t ........................ ok    
./t/Stats.t .......................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Stats.t .......................................... 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/Stats.t .......................................... ok    
./t/AnnotationSource_Cache_Transcript.t .............. Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource_Cache_Transcript.t .............. ok    
./t/AnnotationSource_File_BED.t ...................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/AnnotationSource_File_BED.t ...................... ok    
./t/AnnotationSource.t ............................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource.t ............................... ok    
./t/AnnotationSource_Cache_Variation.t ............... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/AnnotationSource_Cache_Variation.t ............... ok    
./t/Parser_VCF.t ..................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Parser_VCF.t ..................................... ok    
./t/Parser_CAID.t .................................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Parser_CAID.t .................................... ok    
./t/BaseVEP.t ........................................ Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/BaseVEP.t ........................................ 1/? Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/BaseVEP.t ........................................ ok    
./t/AnnotationSource_File_GTF.t ...................... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
Smartmatch is experimental at /Users/bms20/Desktop/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 472.
./t/AnnotationSource_File_GTF.t ...................... ok    
./t/Haplo_AnnotationSource_Database_Transcript.t ..... Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 791.
./t/Haplo_AnnotationSource_Database_Transcript.t ..... ok    

Test Summary Report
-------------------
./t/Runner.t                                       (Wstat: 65280 (exited 255) Tests: 81 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
Files=49, Tests=1870, 61 wallclock secs ( 0.12 usr  0.15 sys + 53.39 cusr  4.10 csys = 57.76 CPU)
Result: FAIL
Failed 1/49 test programs. 0/1870 subtests failed.

Run test:

./vep -i examples/homo_sapiens_GRCh38.vcf --cache
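
The installer warned that without DBD::mysql VEP can only run in offline mode, so the test command may need `--offline` instead of `--cache`. A minimal sketch of picking the flag automatically: vep_mode_flag is a hypothetical helper and the output file name is made up, while `--offline`, `-o` and `--force_overwrite` are existing VEP options. Either mode still needs the cache files to have been downloaded.

```shell
# Hypothetical helper: choose a VEP mode flag based on whether DBD::mysql loads.
vep_mode_flag() {
  if perl -MDBD::mysql -e 1 >/dev/null 2>&1; then
    echo "--cache"
  else
    echo "--offline"
  fi
}

# Usage, from inside the ensembl-vep directory:
# ./vep -i examples/homo_sapiens_GRCh38.vcf "$(vep_mode_flag)" \
#   -o vep_test_output.txt --force_overwrite
```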

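In R (the ensemblVEP Bioconductor package is an interface to VEP, so it still needs a working VEP install):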

if(!require("BiocManager")) install.packages("BiocManager")
BiocManager::install("ensemblVEP")

Method 2: conda environment

The ensembl-vep conda package doesn't seem to work, even with a minimal YAML file.

yml file:

name: vep
channels:
  - conda-forge
  - bioconda 
  - nodefaults 
dependencies:
  - ensembl-vep

UnsatisfiableError:

(base) bms20@IC-WPG44XK9L1 ensembl-vep % conda env create -f /Users/bms20/Desktop/echoverse/echoconda/inst/conda/vep.yml 
Collecting package metadata (repodata.json): done
Solving environment: / 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed                                                                                                                                                                                             
Solving environment: - 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed                                                                                                                                                                                             

UnsatisfiableError: 
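
The log does not show which packages conflict, so the following is only a guess. On Apple Silicon Macs, some bioconda packages have no native build, and forcing the Intel (osx-64) channel subdirectory via conda's CONDA_SUBDIR environment variable sometimes lets the solver succeed. A minimal sketch, where conda_subdir_override is a hypothetical helper and vep.yml stands for the file shown above:

```shell
# Hypothetical helper: suggest a CONDA_SUBDIR override for a given OS/arch.
# Prints "osx-64" for Apple Silicon macOS, nothing otherwise.
conda_subdir_override() {
  os="${1:-$(uname -s)}"
  arch="${2:-$(uname -m)}"
  if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
    echo "osx-64"
  fi
}

# Usage:
# subdir="$(conda_subdir_override)"
# if [ -n "$subdir" ]; then
#   CONDA_SUBDIR="$subdir" conda env create -f vep.yml
# else
#   conda env create -f vep.yml
# fi
```

If the machine is Intel or Linux, the helper prints nothing and the override does not apply, so the conflict has some other cause.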

Even the Docker container fails to start!

docker pull ensemblorg/ensembl-vep

Using default tag: latest
latest: Pulling from ensemblorg/ensembl-vep
064a9bb4736d: Pull complete 
7d12211b745c: Pull complete 
307de17ee77e: Pull complete 
8d4a0e7e512f: Pull complete 
4f4fb700ef54: Pull complete 
6b2174a10aca: Pull complete 
8d251584443b: Pull complete 
f809bf1562aa: Pull complete 
e50449ba073a: Pull complete 
ef9d46de6f67: Pull complete 
0d33d69321d4: Pull complete 
2dac51a30954: Pull complete 
910756a869dc: Pull complete 
c887c64cf3b6: Pull complete 
41a01c89500b: Pull complete 
Digest: sha256:eb2c980f9150069212300bd8c5844283d83fc5920cce7d00884cc2c6f0af6759
Status: Downloaded newer image for ensemblorg/ensembl-vep:latest
docker.io/ensemblorg/ensembl-vep:latest
bms20@IC-WPG44XK9L1 Desktop % docker start c8c448156cc6 

Error response from daemon: No such container: c8c448156cc6
Error: failed to start containers: c8c448156cc6
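
`docker start` restarts an existing container, whereas `docker pull` only downloads an image, so the ID above may have been an image ID rather than a container ID (an assumption; the log does not say where the ID came from). A minimal sketch of creating a container from the pulled image with `docker run` instead: run_vep_image is a hypothetical wrapper name, and the availability of bash inside the image is an assumption.

```shell
# Hypothetical wrapper: start a fresh container from the pulled image.
run_vep_image() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found on PATH" >&2
    return 127
  fi
  # --rm removes the container on exit; -it attaches an interactive terminal.
  docker run --rm -it ensemblorg/ensembl-vep "$@"
}

# Usage:
# docker images ensemblorg/ensembl-vep   # list image IDs
# docker ps -a                           # list container IDs
# run_vep_image bash                     # open a shell inside the container
```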