Skip to content

Instantly share code, notes, and snippets.

@standage
Created April 27, 2013 18:47
Show Gist options
  • Save standage/5474155 to your computer and use it in GitHub Desktop.
Save standage/5474155 to your computer and use it in GitHub Desktop.
A script to process ParsEval output and determine unmatched genes; i.e., reference genes for which there are no overlapping prediction genes, and vice versa; this script expects ParsEval output in text format. Usage: perl pe-uniq.pl < pe-out.txt > uniq-genes.txt
#!/usr/bin/env perl
# pe-uniq.pl: a script to process ParsEval output and determine unmatched genes;
# i.e., reference genes for which there are no overlapping prediction genes, and
# vice versa; this script expects ParsEval output in text format
#
# Usage: perl pe-uniq.pl < pe-out.txt > uniq-genes.txt
use strict;
my $locusmatch = quotemeta("|---- Locus:");
while(my $line = <STDIN>)
{
my @refrgenes;
my @predgenes;
my $locus;
# start looking for unique genes at the beginning of each locus record;
# ignore everything else
if($line =~ m/$locusmatch/)
{
$locus = $line;
# Grab reference genes
skip_lines(3, \*STDIN);
while($line = <STDIN>)
{
last if($line =~ m/^\|$/);
$line =~ s/\|\s+(.+)\s*/$1/;
push(@refrgenes, $line) unless($line eq "None!");
}
# Grab prediction genes
skip_lines(1, \*STDIN);
while($line = <STDIN>)
{
last if($line =~ m/^\|$/);
$line =~ s/\|\s+(.+)\s*/$1/;
push(@predgenes, $line) unless($line eq "None!");
}
# Print out unique genes, if any
my $numrefr = scalar(@refrgenes);
my $numpred = scalar(@predgenes);
die("error: 0 genes at $locus") if($numrefr == 0 and $numpred == 0);
if($numrefr == 0)
{
foreach my $pred(@predgenes)
{
print("pred\t$pred\n");
}
}
elsif($numpred == 0)
{
foreach my $refr(@refrgenes)
{
print("refr\t$refr\n");
}
}
}
}
sub skip_lines
{
my($numskip, $FH) = @_;
while($numskip > 0)
{
my $line = <$FH>;
$numskip--;
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment