@radaniba
Created November 29, 2012 16:50
Run patscan (scan_for_matches) to search for a pattern in every sequence file in a directory
#!/usr/bin/perl -w
# patscan_batch.pl
# Run patscan on all seqs in a folder
# Can be easily modified to run any command on every sequence in a folder
# WI Bioinformatics course - Feb 2002 - Lecture 5
# Revised - Sep 2003
################ User-supplied variables #############
# Directory of sequences
$myDir = "/home/elvis/seqs";
# Output directory (relative to $myDir or full path)
$outputDir = "patscan";
# Path to pattern file
$patFile = "/home/elvis/patterns/polyA.pat";
#########################################################
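# Go to sequence directory and open it (i.e., read contents)
chdir($myDir) || die "Cannot change to $myDir: $!"; # Go to $myDir
opendir(DIR, $myDir) || die "Cannot open $myDir: $!"; # Open $myDir
# Make sure the output directory exists; otherwise the shell redirect below fails
unless (-d $outputDir) { mkdir($outputDir) || die "Cannot create $outputDir: $!"; }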
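foreach my $seqFile (sort readdir(DIR))
{
    if ($seqFile =~ /\.tfa$/) # if file ends in .tfa
    {
        print "Processing $seqFile\n";
        my $outFile = $seqFile; # Create $outFile name
        $outFile =~ s/\.tfa$/.polyA.out/; # s/old/new/; anchored so only the final .tfa is replaced
        ############ Run PATSCAN ###############
        # system() rather than backticks: output goes to a file, so there is nothing to capture
        system("scan_for_matches $patFile < $seqFile > $outputDir/$outFile") == 0
            || warn "scan_for_matches failed on $seqFile (exit status $?)\n";
    }
}
closedir(DIR);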
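
The header comment notes that the script can be modified to run any command on every sequence in a folder. A minimal sketch of that idea follows. The helper names out_name and run_on_dir and the {in}/{out} placeholders are assumptions made for illustration; they are not part of patscan or of the original script. File names are pasted into a shell command as-is, so this assumes names without spaces or shell metacharacters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn "seq1.tfa" into "seq1.polyA.out".
# Only a trailing ".$inExt" is replaced; other names are returned unchanged.
sub out_name {
    my ($file, $inExt, $outExt) = @_;
    (my $out = $file) =~ s/\.\Q$inExt\E$/.$outExt/;
    return $out;
}

# Hypothetical helper: run a command template on every file in $dir
# whose name ends in ".$inExt". The template marks the input and output
# paths with {in} and {out} (a convention invented for this sketch).
# Returns the number of files processed.
sub run_on_dir {
    my ($dir, $outDir, $inExt, $outExt, $template) = @_;
    opendir(my $dh, $dir) || die "Cannot open $dir: $!";
    my @files = sort grep { /\.\Q$inExt\E$/ } readdir($dh);
    closedir($dh);
    foreach my $file (@files) {
        my $in  = "$dir/$file";
        my $out = "$outDir/" . out_name($file, $inExt, $outExt);
        my $cmd = $template;
        $cmd =~ s/\{in\}/$in/g;
        $cmd =~ s/\{out\}/$out/g;
        print "Running: $cmd\n";
        system($cmd) == 0 || warn "Command failed for $file (status $?)\n";
    }
    return scalar @files;
}

# Example call, using the paths from the script above (left commented out):
# run_on_dir("/home/elvis/seqs", "/home/elvis/seqs/patscan", "tfa", "polyA.out",
#            "scan_for_matches /home/elvis/patterns/polyA.pat < {in} > {out}");
```

Keeping the file-name mapping in its own sub makes it easy to check on its own, and swapping the template string is enough to run a different program over the same folder.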