Download specific human DNA sequences from hg19
"""
get_hg19_sequence.py
Jamie Heather, February 2017
For use on Python 2.7, requires urllib2 module
"""

import urllib2


def get_hg19_seq(chrm, seq_from, seq_to):
    """
    Takes a chromosome number or name (1-22, X/Y/M) and two coordinates (from/to)
    Returns the corresponding hg19 nucleotide sequence via the UCSC DAS server.
    """
    base_url = "http://genome.ucsc.edu/cgi-bin/das/hg19/dna?segment=chr"
    page = urllib2.urlopen(base_url + str(chrm) + ":" + str(seq_from) + "," + str(seq_to))
    contents = []
    # Skip the XML markup lines in the DAS response; keep only the sequence lines
    for line in page:
        if "<" not in line:
            contents.append(line.rstrip())
    full_seq = "".join(contents).upper()
    return full_seq
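
The script above targets Python 2.7, where `urllib2` is available. As a rough sketch only, the same logic might be ported to Python 3 along the following lines. The names `build_das_url`, `extract_sequence` and `get_hg19_seq_py3` are hypothetical and not part of the original script. The endpoint is copied unchanged from it, and whether that server still responds in this form is an assumption. Splitting URL construction and response parsing into separate functions lets them be checked without a network connection.

```python
from urllib.request import urlopen

# Endpoint copied from the original script; its current availability is an assumption
BASE_URL = "http://genome.ucsc.edu/cgi-bin/das/hg19/dna?segment=chr"


def build_das_url(chrm, seq_from, seq_to):
    """Assemble the query URL the same way the original script does."""
    return "{}{}:{},{}".format(BASE_URL, chrm, seq_from, seq_to)


def extract_sequence(lines):
    """Drop any line containing '<' (XML markup), then join and uppercase the rest."""
    contents = []
    for line in lines:
        # In Python 3 an HTTP response yields bytes, so decode before testing
        if isinstance(line, bytes):
            line = line.decode("ascii", errors="replace")
        if "<" not in line:
            contents.append(line.rstrip())
    return "".join(contents).upper()


def get_hg19_seq_py3(chrm, seq_from, seq_to):
    """Hypothetical Python 3 counterpart of get_hg19_seq (needs network access)."""
    with urlopen(build_das_url(chrm, seq_from, seq_to)) as page:
        return extract_sequence(page)
```

The parsing step can be tried on a hand-written response, for example `extract_sequence(['<DNA length="8">', 'acgt', 'ttga', '</DNA>'])`, before calling `get_hg19_seq_py3` against the live server.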