@meren
Created October 31, 2012 18:48
# This script takes a merged FASTA file as input and writes the reads that merged with
# 0 mismatches in the overlapping region (and contain no 'n' characters) to another
# FASTA file. Example command line:
#
# python get_0_mismatches_from_merge.py INPUT_FASTA OUTPUT_FASTA
#
import sys
sys.path.append('/bioware/pythonmodules/fastalib/')
import fastalib as u
fasta = u.SequenceSource(sys.argv[1])
output = u.FastaOutput(sys.argv[2])
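while fasta.next():
    # Report progress every 10,000 reads, overwriting the same terminal line.
    if fasta.pos % 10000 == 0:
        sys.stdout.write('\r[%s] %d reads processed so far' % (sys.argv[2], fasta.pos))
        sys.stdout.flush()
    # Keep reads whose ID ends with 'mismatches:0' and whose sequence has no lowercase 'n'.
    if fasta.id.endswith('mismatches:0') and 'n' not in fasta.seq:
        output.store(fasta, split = False)
fasta.close()
# End the progress line with a newline.
sys.stdout.write('\n')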
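
For readers without access to fastalib, the same filter can be sketched in plain Python 3. This is a minimal illustration and not the author's implementation: the helper names `read_fasta`, `keep_read`, and `filter_zero_mismatch` are hypothetical, and the `read1|mismatches:0` header layout in the sample data is an assumption. The only thing taken from the script above is the test itself: the ID must end with `mismatches:0` and the sequence must contain no lowercase `n`.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, ''.join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, ''.join(chunks)


def keep_read(header, seq):
    """Same test as the script: zero mismatches in the overlap, no lowercase 'n'."""
    return header.endswith('mismatches:0') and 'n' not in seq


def filter_zero_mismatch(in_handle, out_handle):
    """Copy passing records to out_handle, unsplit; return (kept, total)."""
    kept = total = 0
    for header, seq in read_fasta(in_handle):
        total += 1
        if keep_read(header, seq):
            out_handle.write('>%s\n%s\n' % (header, seq))
            kept += 1
    return kept, total


if __name__ == '__main__':
    # Hypothetical sample records; real merged-read headers may look different.
    sample = io.StringIO(
        '>read1|mismatches:0\nACGT\nACGT\n'
        '>read2|mismatches:2\nACGTACGT\n'
        '>read3|mismatches:0\nACGnACGT\n'
    )
    out = io.StringIO()
    print(filter_zero_mismatch(sample, out))  # (1, 3)
    print(out.getvalue(), end='')
```

With the sample data only `read1` is kept: `read2` reports two mismatches and `read3` contains an `n`. To run it on real files, pass handles from `open(path)` and `open(path, 'w')` in place of the `io.StringIO` objects.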