Created March 26, 2014 18:09
#!/usr/bin/env perl
# The SILVA database has spaces between every ten bases and there are U's
# where the T's are supposed to be. This script fixes those two problems.
use strict;
use warnings;
use BioUtils::FastaIO;
use BioUtils::FastaSeq;
my $usage = "$0 <fasta_file> <output_file>\n";
my $fasta_file = shift or die $usage;
my $out_file = shift or die $usage;
# create the fasta IO objects
my $in = BioUtils::FastaIO->new({stream_type => '<', file => $fasta_file});
my $out = BioUtils::FastaIO->new({stream_type => '>', file => $out_file});
# read in the sequences
while ( my $seq = $in->get_next_seq() ) {
    my $temp_seq = $seq->get_seq();
    # convert U's to T's
    $temp_seq =~ tr/U/T/;
    # remove the spaces between blocks of bases
    $temp_seq =~ s/ //g;
    $seq->set_seq($temp_seq);
    # output the formatted sequence
    $out->write_seq($seq);
}