@samesense
Last active December 21, 2019 14:43
vcf-snp-check

vcf-snp-lookup

These steps assume some terminal/command line experience.

  • Navigate to your vcf.gz file in the terminal/command line.
  • Check your reference genome (gzcat ships with macOS; on Linux, zcat or gzip -dc does the same job):
gzcat {your_id}.filtered.snp.vcf.gz | head -100 | less

Look for something like -r /references/grch37

My file has the following header lines:

##DRAGENCommandLine=<ID=HashTableBuild,Version="SW: 01.003.044.3.4.11, HashTableVersion: 7",CommandLineOptions="dragen --build-hash-table true --ht-reference /staging/human_g1k_v37_decoy.fasta --output-directory /staging/grch37/ --enable-cnv true --enable-rna true">
##DRAGENCommandLine=<ID=dragen,Version="SW: 05.021.408.3.4.11, HW: 05.021.408",Date="Fri Nov 22 08:28:59 UTC 2019",CommandLineOptions="-f -r /references/grch37 --sv-reference /references/grch37.fasta ...
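Rather than paging through the first 100 lines by eye, the header can be filtered for lines that hint at the reference build. This is only a sketch: `vcf_reference_hints` is a hypothetical helper name, and the keyword list (`reference`, `grch`, `hg` followed by digits) is an assumption about how the build tends to show up in headers, not something every VCF guarantees.

```shell
# Hypothetical helper: print "##" header lines that mention a reference build.
# Assumption: the build name appears as "reference", "grch..." or "hg<digits>".
vcf_reference_hints() {
  gzip -dc "$1" | awk '
    /^##/ { if (tolower($0) ~ /reference|grch|hg[0-9]+/) print; next }
    { exit }  # stop at the first non-"##" line (the #CHROM header)
  '
}

# Usage (file name is a placeholder):
# vcf_reference_hints {your_id}.filtered.snp.vcf.gz
```

On the header shown above, both DRAGENCommandLine lines should be printed, since each mentions grch37.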
  • You need to know the chromosome and position of your variant.

If you have grch37, use GRCh37/hg19 coordinates. If you have grch38, use GRCh38/hg38 coordinates. For this example, assume the variant of interest is GRCh37:Chr19:97852.

  • Grep for the variant:
gzcat {your_id}.filtered.snp.vcf.gz | grep ^19 | grep -w 97852 
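The grep pipeline is quick, but `grep -w 97852` matches that number anywhere on the line (for example inside an INFO value), not only in the position column. A stricter sketch compares the first two tab-separated columns directly; `vcf_lookup` is a hypothetical helper name, and it assumes the chromosome is written exactly as in your file (`19` here, `chr19` in some other files).

```shell
# Hypothetical helper: print records whose CHROM (column 1) and POS (column 2)
# match exactly, skipping header lines.
vcf_lookup() {
  gzip -dc "$1" | awk -F '\t' -v chrom="$2" -v pos="$3" \
    '!/^#/ && $1 == chrom && $2 == pos'
}

# Usage (file name is a placeholder):
# vcf_lookup {your_id}.filtered.snp.vcf.gz 19 97852
```

For the example variant this should return the same single record as the grep pipeline.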

Results

19      97852   .       G       T       7.4     PASS    AC=1;AF=0.5;AN=2;DP=8;FS=0;MQ=12.21;MQRankSum=0;QD=0.93;ReadPosRankSum=0.572;SOR=0.169;FractionInformativeReads=1;R2_5P_bias=20.767;VQSLOD=-1.40782 GT:AD:AF:DP:F1R2:F2R1:GQ:PL:GP:PRI:SB:MB 0/1:6,2:0.25:8:1,2:5,0:7:41,0,14:7.4031,0.95742,17.957:0,34.77,37.77:0,6,0,2:5,1,0,2

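Focus on the start of the sample column, 0/1:6,2:0.25:8, which lines up with the first four FORMAT keys, GT:AD:AF:DP. GT 0/1 is a heterozygous call (one G allele, one T allele). DP 8 is your total read depth. AD 6,2 means 6 reads support G (the reference) and 2 reads support T (the alternate). AF 0.25 is the alternate allele fraction: 0.25 * 8 = 2 reads are T.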
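Matching keys to values by counting colons is error-prone, so here is a small sketch that pairs them up. `vcf_sample_fields` is a hypothetical helper name; it assumes a single-sample VCF (FORMAT in column 9, the sample in column 10) with tab-separated records on standard input. The extra `ALT_FRACTION` line is computed here from AD as alt / (ref + alt) and is not a field in the file.

```shell
# Hypothetical helper: pair each FORMAT key (column 9) with the sample value
# (column 10). Assumes a single-sample, tab-separated VCF record on stdin.
vcf_sample_fields() {
  awk -F '\t' '{
    n = split($9, keys, ":")
    split($10, vals, ":")
    for (i = 1; i <= n; i++) {
      print keys[i] "=" vals[i]
      # For a biallelic site, derive the alternate read fraction from AD.
      if (keys[i] == "AD") {
        m = split(vals[i], ad, ",")
        if (m == 2 && ad[1] + ad[2] > 0)
          printf "ALT_FRACTION=%.2f\n", ad[2] / (ad[1] + ad[2])
      }
    }
  }'
}

# Usage (file name is a placeholder):
# gzcat {your_id}.filtered.snp.vcf.gz | grep ^19 | grep -w 97852 | vcf_sample_fields
```

For the record above, the output should begin with GT=0/1 and AD=6,2, followed by the derived fraction 2 / (6 + 2) = 0.25, which agrees with the reported AF.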
