@pranavathiyani
Last active February 9, 2020 15:55
Perl script for extracting hypothetical proteins from a multi-protein FASTA file. The input must be in single-line FASTA format: each header line is followed by exactly one sequence line.
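#!/usr/bin/perl
# Executing the program requires sequence.fasta in single-line FASTA format
# (each header line is followed by exactly one sequence line).
use strict;
use warnings;
open(my $in, '<', 'sequence.fasta') or die "Cannot open sequence.fasta: $!"; # sequence.fasta is the input file
open(my $out, '>>', 'hypothetical_proteins.fasta') or die "Cannot open hypothetical_proteins.fasta: $!"; # results are appended
my @lines = <$in>;
close($in);
my $hits = "";
for (my $i = 0; $i < @lines; $i++) {
    if ($lines[$i] =~ m/hypothetical/) {
        $hits .= $lines[$i];                          # the matching header line
        $hits .= $lines[$i + 1] if $i + 1 < @lines;   # its sequence on the next line, if there is one
    }
}
print $out $hits;
close($out) or die "Cannot close hypothetical_proteins.fasta: $!";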
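
The script above only works when every record occupies exactly two lines. If the input has wrapped sequence lines, it could first be converted with something like the following minimal sketch. The helper name to_single_line and the sample records are hypothetical and not part of the original gist.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original gist): joins wrapped sequence
# lines so that each record becomes one header line plus one sequence line.
sub to_single_line {
    my @lines = @_;
    my @out;
    my $seq       = '';
    my $in_record = 0;    # true once the first header has been seen
    for my $line (@lines) {
        $line =~ s/\r?\n\z//;    # strip Unix or Windows line endings
        next if $line eq '';     # skip blank lines
        if ($line =~ /^>/) {
            push @out, "$seq\n" if $in_record;    # flush the previous record's sequence
            push @out, "$line\n";
            $seq       = '';
            $in_record = 1;
        }
        elsif ($in_record) {
            $seq .= $line;       # text before the first header is ignored
        }
    }
    push @out, "$seq\n" if $in_record;
    return @out;
}

# Made-up sample input with a sequence wrapped over two lines.
my @wrapped = (
    ">p1 hypothetical protein\n", "MKT\n", "AYI\n",
    ">p2 kinase\n",               "MSL\n",
);
print to_single_line(@wrapped);
# prints:
# >p1 hypothetical protein
# MKTAYI
# >p2 kinase
# MSL
```

Writing the returned lines to sequence.fasta would give the extraction script the two-lines-per-record layout it assumes.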