@philippbayer
Created June 11, 2015 21:01
Getting 1000 Genomes data with tabix and Python
urls={"1":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"10":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr10.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"11":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr11.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"12":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr12.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"13":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr13.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"14":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr14.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"15":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr15.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"16":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr16.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"17":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr17.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"18":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr18.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"19":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr19.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"2":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr2.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"20":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr20.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"21":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr21.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"22":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"3":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr3.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"4":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr4.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"5":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr5.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"6":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr6.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"7":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr7.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"8":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr8.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
"9":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr9.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz"}
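import subprocess

# Collect one tabix region string ("chrom:pos-pos") per SNP, grouped by chromosome.
chroms = {}
for l in open("your_plink_file.map"):
    ll = l.rstrip().split("\t")
    # PLINK .map columns: chromosome, SNP ID, genetic distance, base-pair position
    chrom, pos = ll[0], ll[-1]
    if chrom in chroms:
        chroms[chrom].append("%s:%s-%s" % (chrom, pos, pos))
    else:
        chroms[chrom] = ["%s:%s-%s" % (chrom, pos, pos)]

# Query each chromosome's remote VCF with tabix, drop lines containing "SV",
# and write the result to <chrom>.vcf in the current directory.
for chrom in chroms:
    if chrom not in urls:
        # urls only covers the autosomes 1-22; skip anything else (e.g. X, Y, MT)
        print("no URL for chromosome %s, skipping" % chrom)
        continue
    print("getting %s SNPs for chromosome %s" % (len(chroms[chrom]), chrom))
    command = "./tabix-0.2.6/tabix -fh %s %s | grep -v SV > %s.vcf" % (urls[chrom], " ".join(chroms[chrom]), chrom)
    # subprocess.call waits for the pipeline to finish before the next chromosome starts
    subprocess.call(command, shell=True)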
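The same steps can be split into small functions that are easy to check without touching the network. This is only a sketch, not part of the original gist: the names `URL_TEMPLATE`, `build_urls`, `regions_from_map`, and `tabix_args` are hypothetical, the URL pattern is copied from the `urls` entries above, and the tabix path is assumed to be the same `./tabix-0.2.6/tabix` used in the script.

```python
from collections import defaultdict

# Assumed URL pattern, copied from the entries of the `urls` dictionary above.
URL_TEMPLATE = (
    "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/"
    "ALL.chr{chrom}.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz"
)


def build_urls(chromosomes=range(1, 23)):
    """Return a {chromosome name: VCF URL} mapping for the autosomes."""
    return {str(c): URL_TEMPLATE.format(chrom=c) for c in chromosomes}


def regions_from_map(lines):
    """Group PLINK .map rows into tabix region strings, keyed by chromosome."""
    regions = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # First column is the chromosome, last column the base-pair position.
        chrom, pos = fields[0], fields[-1]
        regions[chrom].append("{0}:{1}-{1}".format(chrom, pos))
    return dict(regions)


def tabix_args(url, regions, tabix="./tabix-0.2.6/tabix"):
    """Build a tabix argument list that can be run without a shell."""
    return [tabix, "-fh", url] + list(regions)
```

The list returned by `tabix_args` can be handed to `subprocess.Popen` with `stdout=subprocess.PIPE`, and lines containing `SV` can then be filtered in Python instead of through `grep`.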