@JoaoRodrigues
Created June 5, 2019 19:43
Using Biopython (Bio.Entrez) to fetch sequences from NCBI
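from Bio import Entrez

# NCBI asks for a contact address with every E-utilities request.
Entrez.email = None  # change this!
assert Entrez.email is not None

# Read ids from file to a list
id_list = [ ... ]

results = {}

# Fetch all records in one request and parse the XML response.
handle = Entrez.efetch(db="protein", id=id_list, retmode="xml")
with handle:
    records = Entrez.read(handle)

# Pair each requested id with its record; this assumes NCBI returns
# one record per id, in the same order as id_list.
for id_i, rec in zip(id_list, records):
    seq = rec['GBSeq_sequence'].upper()
    results[id_i] = seq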
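
The snippet leaves the "read ids from file" step open. Below is a minimal sketch of one way to fill it in, assuming a plain-text file with one identifier per line. The helper names `read_ids` and `batched`, the file name `ids.txt`, and the batch size of 200 are illustrative assumptions and not part of the original gist. Splitting the list into batches is optional, but it keeps each efetch request small when the id list is long.

```python
from itertools import islice


def read_ids(path):
    """Return the ids in a text file, one per line (hypothetical helper).

    Blank lines and lines starting with '#' are skipped.
    """
    ids = []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#"):
                ids.append(line)
    return ids


def batched(ids, size):
    """Yield successive lists of at most `size` ids (hypothetical helper)."""
    it = iter(ids)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# Possible use with the snippet above (not executed here; needs network access
# and assumes a file named ids.txt exists):
#
# id_list = read_ids("ids.txt")
# for chunk in batched(id_list, 200):
#     handle = Entrez.efetch(db="protein", id=chunk, retmode="xml")
#     with handle:
#         records = Entrez.read(handle)
#     for id_i, rec in zip(chunk, records):
#         results[id_i] = rec['GBSeq_sequence'].upper()
```

Each batch is zipped against its own chunk, so the same one-record-per-id assumption from the original loop still applies within every request.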