Last active Mar 3, 2022
How to linearize a FASTA sequence using awk.

Linearize a fasta sequence

awk -f linearizefasta.awk < input.fa


awk '/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;} {printf("%s",$0);} END {printf("\n");}' < input.fa
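To make the effect concrete, here is a small sketch run on a made-up two-record file. The temporary directory, the record names and the sequences are assumptions chosen for illustration; only the awk program comes from the snippet above. Each record ends up on one line: the header, a tab, then the sequence with its line breaks removed.

```shell
# Hypothetical two-record input, written to a temp directory for illustration.
tmpdir=$(mktemp -d)
printf '>seq1 first\nACGT\nTTGA\n>seq2\nGGCC\nAA\n' > "$tmpdir/input.fa"

# Same awk program as above: print the header and a tab, then append
# every sequence line without a newline; a newline precedes each later header.
awk '/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;} {printf("%s",$0);} END {printf("\n");}' \
    < "$tmpdir/input.fa" > "$tmpdir/linearized.tsv"

cat "$tmpdir/linearized.tsv"
# >seq1 first<TAB>ACGTTTGA
# >seq2<TAB>GGCCAA
```

Because the tab is used as the separator, this assumes the headers themselves contain no tab characters.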

Format back to fasta

tr "\t" "\n" < linearized.tsv

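If you know your FASTA headers are shorter than 60 characters, you can also re-wrap the sequence lines at 60 columns (fold would split a longer header as well):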

tr "\t" "\n" < linearized.tsv | fold -w 60
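As a quick check of the re-wrapping step, the sketch below builds a made-up linearized record with a 5-character header and a 70-base sequence, then prints the length of each output line. The file names and the record are assumptions for illustration only.

```shell
# Hypothetical linearized record: short header, tab, 70 bases on one line.
tmpdir=$(mktemp -d)
seq70=$(awk 'BEGIN { for (i = 0; i < 70; i++) printf("A"); }')
printf '>seq1\t%s\n' "$seq70" > "$tmpdir/linearized.tsv"

# Turn the tab back into a newline, then wrap every line at 60 columns.
tr "\t" "\n" < "$tmpdir/linearized.tsv" | fold -w 60 > "$tmpdir/wrapped.fa"

# Print the length of each line: header, full 60-base line, remainder.
awk '{ print length($0) }' "$tmpdir/wrapped.fa"
# 5
# 60
# 10
```

The header survives intact only because it is shorter than the fold width; a header of 60 characters or more would be split across lines too.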

linearizefasta.awk

/^>/ {printf("%s%s\t",(N>0?"\n":""),$0);N++;next;}
{printf("%s",$0);}
END {printf("\n");}

upendrak commented Mar 16, 2018

Thanks. I find it quite useful
