Skip to content

Instantly share code, notes, and snippets.

@willis7
Created August 20, 2018 10:14
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save willis7/883c42fcac8608097e46b13a1abd8787 to your computer and use it in GitHub Desktop.
Save willis7/883c42fcac8608097e46b13a1abd8787 to your computer and use it in GitHub Desktop.
search for duplicate words
#!/usr/bin/env perl
# Finds duplicate adjacent words.
use strict ;
my $DupCount = 0 ;
if (!@ARGV) {
print "usage: dups <file> ...\n" ;
exit ;
}
while (1) {
my $FileName = shift @ARGV ;
# Exit code = number of duplicates found.
exit $DupCount if (!$FileName) ;
open FILE, $FileName or die $!;
my $LastWord = "" ;
my $LineNum = 0 ;
while (<FILE>) {
chomp ;
$LineNum ++ ;
my @words = split (/(\W+)/) ;
foreach my $word (@words) {
# Skip spaces:
next if $word =~ /^\s*$/ ;
# Skip punctuation:
if ($word =~ /^\W+$/) {
$LastWord = "" ;
next ;
}
# Found a dup?
if (lc($word) eq lc($LastWord)) {
print "$FileName:$LineNum $word\n" ;
$DupCount ++ ;
} # Thanks to Sean Cronin for tip on case.
# Mark this as the last word:
$LastWord = $word ;
}
}
close FILE ;
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment