@stedy
Created November 11, 2010 00:21
A script for getting ancestral alleles from dbSNP based on an input list of rs numbers
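#!/usr/bin/env python3
# Filename: rsquery2.py
# Look up the ancestral allele on dbSNP for each rs number listed in "snplist".
import time
import urllib.request

prefix = "http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs="

# One rs number per line; strip the trailing newline and skip blank lines.
with open("snplist") as rsn:
    rsnumbers = [line.strip() for line in rsn if line.strip()]

f = open('ancestral0624', 'a')
for line in rsnumbers:
    try:
        entrez = urllib.request.urlopen(prefix + line).read().decode("utf-8", "replace")
        time.sleep(.5)  # pause between requests
        # Keep the page text between the "Ancestral Allele" and "Clinical Association" labels.
        text = entrez.split("Ancestral Allele")[-1].split("Clinical Association")[0]
        # The allele is whatever follows the 'f1f1f1">' marker, up to the end of that table row.
        output = text.split('f1f1f1">')[-1].split('</td></TR>')[0]
        final = output, line
        finalstr = str(final)
        f.write("%s\n" % finalstr)
    except IOError:
        time.sleep(5)  # back off after a network error; this rs number is skipped
f.close()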
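The parsing step in the script depends entirely on the layout of the dbSNP page, so it can help to try it offline first. The sketch below pulls the two `split` chains into a function and runs it on a small HTML fragment. The function name `extract_ancestral_allele` and the `sample` fragment are hypothetical, invented for illustration only; the fragment is not real dbSNP output, and the real page may be laid out differently.

```python
def extract_ancestral_allele(html):
    """Return the text between the script's markers, or None if the label is absent."""
    if "Ancestral Allele" not in html:
        return None
    # Same slicing as the script: keep the text after the "Ancestral Allele"
    # label and before the "Clinical Association" label.
    text = html.split("Ancestral Allele")[-1].split("Clinical Association")[0]
    # Then keep what follows the 'f1f1f1">' marker, up to the end of the row.
    return text.split('f1f1f1">')[-1].split('</td></TR>')[0].strip()


# Hypothetical fragment (an assumption, not copied from dbSNP).
sample = ('<TR><td>Ancestral Allele</td><td bgcolor="#f1f1f1">G</td></TR>'
          '<TR><td>Clinical Association</td></TR>')

print(extract_ancestral_allele(sample))
```

Returning `None` when the label is missing makes it possible to tell a page without an ancestral allele entry apart from a successful lookup, which the plain `split` chain cannot do because it falls back to returning the whole page text.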