@armanbilge
Created July 30, 2013 03:56
A script to deinterleave PHYLIP-formatted sequence files.
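#!/usr/bin/env python
import sys

# Read from the file named on the command line; fall back to stdin
# if no argument was given or the file cannot be opened.
try:
    stream = open(sys.argv[1], 'r')
except (IndexError, IOError):
    stream = sys.stdin

is_header = True
is_first_row = True  # True while we are still inside the first block
lines = []           # one entry per sequence: name + accumulated sequence
i = 0                # row index within the current block

for l in stream:
    l = l.rstrip('\n')
    if is_header:
        # The first line (sequence count and length) is passed through as-is.
        print(l)
        is_header = False
    elif l.strip() == '':
        # A blank line separates blocks; restart the row counter.
        # Ignore blank lines that appear before any sequence row.
        if lines:
            is_first_row = False
        i = 0
    elif is_first_row:
        # Rows of the first block carry the sequence names.
        lines.append(l)
    else:
        # Rows of later blocks are appended to the matching sequence.
        lines[i] += ' ' + l.lstrip()
        i += 1

if stream is not sys.stdin:
    stream.close()

for l in lines:
    print(l)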
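
As a rough illustration of what the script does, here is the same logic wrapped in a function and run on a tiny made-up alignment. The function name `deinterleave` and the example data are hypothetical and not part of the gist; this is a sketch that assumes blocks are separated by blank lines and that every block lists the sequences in the same order.

```python
def deinterleave(text):
    """Return (header, rows) for interleaved PHYLIP text.

    The first block carries the names; rows of later blocks are
    appended to the matching row, separated by a single space.
    """
    raw = text.splitlines()
    header, body = raw[0], raw[1:]
    rows = []
    i = 0
    in_first_block = True
    for line in body:
        if line.strip() == '':
            # Blank line: next block starts, restart the row counter.
            if rows:
                in_first_block = False
            i = 0
        elif in_first_block:
            rows.append(line)
        else:
            rows[i] += ' ' + line.lstrip()
            i += 1
    return header, rows


# Made-up two-sequence alignment, interleaved in two blocks.
example = """ 2 20
seq_one   ACGTACGTAC
seq_two   TGCATGCATG

GGGGGCCCCC
AAAAATTTTT
"""

header, rows = deinterleave(example)
print(header)
for row in rows:
    print(row)
# prints:
#  2 20
# seq_one   ACGTACGTAC GGGGGCCCCC
# seq_two   TGCATGCATG AAAAATTTTT
```

Note that the joined blocks keep a space between them, which is what the comment below is about.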
@dysh
dysh commented May 3, 2014

Hi! Some programs like TCS do not tolerate spaces in sequences. Therefore the last line may be substituted with something like:

for l in lines: print(l[:10] + l[10:].replace(" ", ""))

cheers
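
A small sketch of the same idea as a reusable helper, in case it is useful. The name `strip_sequence_spaces` and the `name_width` parameter are hypothetical; the default of 10 assumes the strict PHYLIP layout, where the name occupies the first 10 columns. Relaxed PHYLIP files with longer names would need a different split.

```python
def strip_sequence_spaces(row, name_width=10):
    """Remove spaces from the sequence part of a row, keeping the name field."""
    name, seq = row[:name_width], row[name_width:]
    return name + seq.replace(' ', '')


row = "seq_one   ACGTACGTAC GGGGGCCCCC"
print(strip_sequence_spaces(row))
# prints: seq_one   ACGTACGTACGGGGGCCCCC
```

Slicing at `name_width` rather than `name_width + 1` keeps every sequence character even when no space follows the name field.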
