@JamieHeather
Last active February 26, 2017 00:40
Download specific human DNA sequences from hg19
"""
get_hg19_sequence.py
Jamie Heather, February 2017
For use on Python 2.7, requires urllib2 module
"""
import urllib2


def get_hg19_seq(chrm, seq_from, seq_to):
    """
    Takes a chromosome number or name (1-22, X/Y/M) and two coordinates (from/to).
    Returns the corresponding hg19 nucleotide sequence via the UCSC DAS server.
    """
    base_url = "http://genome.ucsc.edu/cgi-bin/das/hg19/dna?segment=chr"
    page = urllib2.urlopen(base_url + str(chrm) + ":" + str(seq_from) + "," + str(seq_to))
    contents = []
    for line in page:
        # Skip XML markup lines; only the bare sequence lines contain no "<"
        if "<" not in line:
            contents.append(line.rstrip())
    full_seq = "".join(contents).upper()
    return full_seq
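
The two steps inside get_hg19_seq (building the DAS query URL, then filtering markup lines out of the response) can be tried offline with the sketch below. The helper names build_das_url and parse_das_lines are hypothetical and not part of the original script. The sample response is an assumed shape of a DAS reply with made-up placeholder bases, not real hg19 sequence. No network request is made.

```python
def build_das_url(chrm, seq_from, seq_to):
    """Hypothetical helper: build the query URL the same way get_hg19_seq does."""
    base_url = "http://genome.ucsc.edu/cgi-bin/das/hg19/dna?segment=chr"
    return base_url + str(chrm) + ":" + str(seq_from) + "," + str(seq_to)


def parse_das_lines(lines):
    """Hypothetical helper: drop lines containing "<", join the rest, uppercase."""
    contents = []
    for line in lines:
        if "<" not in line:
            contents.append(line.rstrip())
    return "".join(contents).upper()


# Assumed response layout with placeholder bases (NOT real hg19 data)
sample_response = [
    '<?xml version="1.0" standalone="no"?>\n',
    '<DASDNA>\n',
    '<SEQUENCE id="chr1" start="1" stop="20" version="1.00">\n',
    '<DNA length="20">\n',
    'acgtacgtac\n',
    'ggttaaccgg\n',
    '</DNA>\n',
    '</SEQUENCE>\n',
    '</DASDNA>\n',
]

url = build_das_url(1, 1, 20)
seq = parse_das_lines(sample_response)
```

Splitting the URL construction from the parsing keeps the line filter testable without contacting the server. Note that on Python 3 the lines returned by urlopen are bytes, so they would need decoding before the `"<" not in line` check.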