@EBIshengoma
Last active August 29, 2015 13:56
Script for concatenating fragmented tblastn output (not working, though)
from Bio.Blast import NCBIXML

input_path = '/home/edson/tblastn_results_test/Sample_output.xml'

with open(input_path) as tblastn_file:
    # NCBIXML.parse reads lazily, so the records must be consumed while the file is open
    records = NCBIXML.parse(tblastn_file)
    for record in records:
        for alignment in record.alignments:
            # sort the HSPs of this hit by their position on the query
            hits = sorted((hsp.query_start, hsp.query_end, hsp.sbjct_start, hsp.sbjct_end,
                           alignment.title, hsp.query, hsp.sbjct)
                          for hsp in alignment.hsps)
            complete_query_seq = ''
            complete_sbjct_seq = ''
            for q_start, q_end, sb_start, sb_end, title, query, sbjct in hits:
                print(title)
                print('The query starts from position: ' + str(q_start))
                print('The query ends at position: ' + str(q_end))
                print('The hit starts at position: ' + str(sb_start))
                print('The hit ends at position: ' + str(sb_end))
                print('The query is: ' + query)
                print('The hit is: ' + sbjct)
                # hsp.query and hsp.sbjct already hold only the aligned portion, so append
                # them whole. Slicing them with the alignment coordinates, as in
                # query[q_start-1:q_end], indexes the wrong positions and is often empty.
                # Overlapping HSPs are not trimmed here, so shared residues appear twice.
                complete_query_seq += query
                complete_sbjct_seq += sbjct
            # one concatenated pair per hit, printed after all of its HSPs
            print('Complete query seq is: ' + complete_query_seq)
            print('Complete subject seq is: ' + complete_sbjct_seq)
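The concatenation step can be tried without a BLAST output file. The sketch below is only an illustration: the helper name `concatenate_fragments`, the `(query_start, query_seq, sbjct_seq)` tuple layout, and the sequences are all made up for the example and are not part of Biopython. It joins fragments in query order and shows why slicing an aligned fragment with its alignment coordinates tends to return an empty string.

```python
def concatenate_fragments(fragments):
    """Join aligned fragment strings in order of query start.

    `fragments` is a list of (query_start, query_seq, sbjct_seq) tuples.
    This layout is an assumption made for the sketch, not a Biopython type.
    """
    ordered = sorted(fragments, key=lambda frag: frag[0])
    query = ''.join(frag[1] for frag in ordered)
    sbjct = ''.join(frag[2] for frag in ordered)
    return query, sbjct


# Made-up fragments, listed out of order on purpose.
fragments = [
    (51, 'MKV-LT', 'MKVALT'),
    (1, 'ACDE', 'ACDQ'),
]
full_query, full_sbjct = concatenate_fragments(fragments)
print(full_query)
print(full_sbjct)

# The fragment string is only 6 characters long, so a slice that starts
# at index 50 (query_start - 1) selects nothing.
print(repr('MKV-LT'[51 - 1:56]))
```

With real tblastn results the subject coordinates count nucleotides while the subject string is translated protein, so the same slicing problem is even more pronounced on the subject side.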