Quick hack for a Biopython conversion question http://mailman.open-bio.org/pipermail/biopython/2015-June/015648.html
from Bio import SeqIO

with open("CP008802.txt", "w") as output:
    # Header row for the tab-separated table
    output.write("Seqname\tSource\tfeature\tStart\tEnd\tScore\tStrand\tFrame\tAttributes\n")
    for record in SeqIO.parse("CP008802.gbk", "genbank"):
        print("Converting %s" % record.name)
        for f in record.features:
            # Only gene features are written to the table
            if f.type != "gene":
                continue
            locus_tag = f.qualifiers["locus_tag"][0]
            # Compound (join) locations are reported on screen and skipped
            if len(f.location.parts) > 1:
                print("What should we do for %s (compound location)? %s" % (locus_tag, f.location))
                continue
            # Biopython starts are 0-based, so add one for a 1-based start column
            output.write('%s\tGenBank\t%s\t%i\t%i\t0,000000\t%s\t.\tlocus_tag\t"%s"; transcript_id "%s"\n'
                         % (record.name, f.type,
                            f.location.start + 1, f.location.end, f.location.strand,
                            locus_tag, locus_tag))
print("Done")
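
One thing to watch in the script: Biopython stores the strand as 1, -1, 0 or None, so the Strand column receives 1 or -1 rather than the + and - symbols that GFF/GTF-style tables normally use. If symbols are wanted, a small helper along these lines might do. This is only a sketch: `strand_symbol` is a hypothetical name, not part of Biopython, and writing "." for a missing or unknown strand is an assumption.

```python
def strand_symbol(strand):
    """Map a Biopython-style strand value to a GFF-style symbol."""
    if strand == 1:
        return "+"
    if strand == -1:
        return "-"
    # Assumption: anything else (0 or None) is written as "."
    return "."


for value in (1, -1, None):
    print(value, strand_symbol(value))
```

In the script, this would mean passing `strand_symbol(f.location.strand)` in place of `f.location.strand` in the `output.write` call.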

@peterjc commented Jun 2, 2015

Assuming you've saved http://www.ncbi.nlm.nih.gov/nuccore/CP008802 as a plain-text GenBank file named CP008802.gbk in the current directory, this will write CP008802.txt and print the following on screen:

$ python genbank_to_table.py
Converting CP008802
What should we do for FB03_00005 (compound location)? join{[0:6](-), [2158322:2159306](-)}
Done
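
The one gene reported above, FB03_00005, has a two-part join location, which looks like a feature spanning the origin of a circular sequence, and the script simply skips it. One possible answer to "what should we do?" is to write one row per part. The sketch below assumes that choice and works on plain `(start, end, strand)` tuples using Biopython's 0-based, end-exclusive convention; with Biopython they could be built as `[(int(p.start), int(p.end), p.strand) for p in f.location.parts]`. `rows_for_parts` is a hypothetical helper, not a Biopython function.

```python
def rows_for_parts(seqname, feature_type, locus_tag, parts):
    """Return one tab-separated row per location part.

    parts: iterable of (start, end, strand) with a 0-based start and an
    exclusive end, as Biopython reports them.
    """
    rows = []
    for start, end, strand in parts:
        rows.append("\t".join([
            seqname, "GenBank", feature_type,
            str(start + 1), str(end),  # convert to a 1-based, inclusive range
            "0,000000", str(strand), ".",
            # Same attribute layout as the original script
            'locus_tag\t"%s"; transcript_id "%s"' % (locus_tag, locus_tag),
        ]))
    return rows


# The two parts printed for FB03_00005 in the output above
for row in rows_for_parts("CP008802", "gene", "FB03_00005",
                          [(0, 6, -1), (2158322, 2159306, -1)]):
    print(row)
```

Whether separate rows per part suit the downstream tool is a judgment call; a single row covering the whole span would be misleading for a feature that wraps around the origin.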