@jrherr
Created February 8, 2015 16:55
Rename SPAdes output FASTA header lines for pipeline
sed -r "s/>NODE(_[0-9]+)_(.*)/>${input.name}\1 \2/g" $input > $output
jrherr commented Feb 8, 2015

The FASTA headers from SPAdes output look like this:

>NODE_100_length_628_cov_0.818363_ID_199

As I understand it, NODE_100 is the contig number, length is the contig length in bases, and cov is the k-mer coverage of the contig for the largest k value used (not a percentage of reads). ID is, I guess, an arbitrary internal ID from SPAdes.

See here: http://bit.ly/1xQcAzm
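
As a rough sketch of what the sed command does to a header like the one above: the snippet below applies the same substitution outside the pipeline. The sample name `sampleA` is a hypothetical stand-in for `${input.name}`, and `rename_header` is just an illustrative wrapper, not part of the original command. It uses `-E` instead of `-r` (both GNU and BSD sed accept it) and anchors the pattern at the start of the line.

```shell
# Hypothetical sample name, standing in for ${input.name} in the pipeline.
sample="sampleA"

# Illustrative wrapper: reads FASTA on stdin, writes renamed FASTA on stdout.
# ">NODE_<n>_<rest>" becomes "><sample>_<n> <rest>"; sequence lines pass through.
rename_header() {
  sed -E "s/^>NODE(_[0-9]+)_(.*)/>${sample}\1 \2/"
}

printf '>NODE_100_length_628_cov_0.818363_ID_199\nACGT\n' | rename_header
# prints:
# >sampleA_100 length_628_cov_0.818363_ID_199
# ACGT
```

The space inserted after the contig number means most tools will treat `sampleA_100` as the sequence ID and the rest as the description.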
