@johnsolk
Last active October 12, 2017 17:27
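# Run from within a py3 virtualenv:
#   source ~/bin/py3/bin/activate
# or install screed system-wide:
#   sudo pip install screed
import screed

# Write the name (header line) of every FASTA record to table.txt, one per line
with open("table.txt", "w") as fq:
    for r in screed.open("trinity.nema.full.fasta"):
        fq.write(r.name + "\n")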
@johnsolk

# Transcript ID = first space-delimited field of each header
cut -d" " -f1 table.txt > table1.txt
# Gene ID = transcript ID up to the first underscore; paste it next to the transcript ID
cut -d"_" -f1 table1.txt | paste table1.txt - > table3.txt
rm -f table.txt table1.txt
mv table3.txt nema_transcript_gene_id.txt
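
The cut/paste steps above could also be done directly in Python, skipping the intermediate table files. This is only a sketch: the helper names `transcript_gene_pair` and `write_gene_map` are made up for illustration, and it assumes the gene ID is always the part of the transcript ID before the first underscore, as in the shell commands.

```python
def transcript_gene_pair(header):
    """Return (transcript_id, gene_id) for one FASTA header (hypothetical helper)."""
    # Same as cut -d" " -f1: keep the text before the first space
    transcript_id = header.split(" ", 1)[0]
    # Same as cut -d"_" -f1: keep the text before the first underscore
    gene_id = transcript_id.split("_", 1)[0]
    return transcript_id, gene_id


def write_gene_map(headers, out_path):
    """Write one tab-separated transcript/gene pair per header (hypothetical helper)."""
    with open(out_path, "w") as out:
        for header in headers:
            out.write("\t".join(transcript_gene_pair(header)) + "\n")
```

With screed installed, something like `write_gene_map((r.name for r in screed.open("trinity.nema.full.fasta")), "nema_transcript_gene_id.txt")` should produce the same two-column file.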

@johnsolk

head nema_transcript_gene_id.txt
Output:

comp12_c0_seq1  comp12
comp16_c0_seq1  comp16
comp53_c0_seq1  comp53
comp53_c1_seq1  comp53
comp64_c0_seq1  comp64
comp67_c0_seq1  comp67
comp86_c0_seq1  comp86
comp90_c0_seq1  comp90
comp100_c0_seq1         comp100
comp104_c0_seq1         comp104
