How to linearize a FASTA sequence using awk.

Linearize a fasta sequence

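Using the script file:

awk -f linearizefasta.awk < input.fa

or as a one-liner:

awk '/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;} {printf("%s",$0);} END {printf("\n");}' < input.fa

Each record is printed on a single line: the header, a tab, then the whole sequence. Redirect the output to a file (e.g. linearized.tsv) to reuse it in the commands below.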
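As a quick sanity check, the one-liner can be tried on a small made-up input. The two records below are purely illustrative, and the variable name `linearized` is an assumption made for this sketch.

```shell
# Feed a tiny two-record FASTA (hypothetical data) through the one-liner.
linearized=$(printf '>seq1 first\nACGT\nTTGA\n>seq2\nGGCC\nAA\n' |
  awk '/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;} {printf("%s",$0);} END {printf("\n");}')

# Each record is now a single line: header, tab, concatenated sequence.
printf '%s\n' "$linearized"
```

With one record per line, line-oriented tools such as sort, grep, or cut can operate on whole records.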

Format back to fasta

tr "\t" "\n" < linearized.tsv

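If you know your FASTA headers are shorter than 60 characters, you can also re-wrap the sequence at 60 columns (fold wraps every long line, headers included):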

tr "\t" "\n" < linearized.tsv | fold -w 60
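When headers may exceed 60 characters, one possible alternative is to wrap only the sequence field in awk and leave the header untouched. This is a minimal sketch, not part of the original script; the sample header, the 130-base sequence, and the variable names `header`, `seq`, and `wrapped` are all assumptions made for illustration.

```shell
# Hypothetical linearized record: a header longer than 60 characters
# and a 130-base sequence built by repeating a 10-base motif 13 times.
header='>seq1 a deliberately long description that runs well past sixty characters in total'
seq=$(awk 'BEGIN { for (i = 0; i < 13; i++) printf "ACGTACGTAC" }')

wrapped=$(printf '%s\t%s\n' "$header" "$seq" |
  awk -F '\t' '{
    print $1                                 # header is printed unwrapped
    for (i = 1; i <= length($2); i += 60)    # sequence in chunks of at most 60
      print substr($2, i, 60)
  }')

printf '%s\n' "$wrapped"
```

Like the tr approach, this assumes the header itself contains no tab character.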

linearizefasta.awk

/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;}
     {printf("%s",$0);}
END  {printf("\n");}

upendrak commented Mar 16, 2018

Thanks. I find it quite useful

