@simoncozens
Created March 19, 2014 03:07
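use strict;
use warnings;
use Lingua::StopWords qw(getStopWords);
use Lingua::EN::Inflect::Number qw(to_S);

# Standard English stop words, plus a few archaic ones.
my $stop = getStopWords("en");
$stop->{$_}++ for qw/thee thou ye shall unto hath/;

# Count each remaining word, lower-cased and singularized.
my %hash;
while (<>) {
    chomp;
    # split ' ' skips leading whitespace, so no empty first field is counted.
    my @words = split ' ', $_;
    for (@words) {
        $_ = to_S(lc $_);
        next if $stop->{$_};
        $hash{$_}++;
    }
}

# Print "word; count" pairs, sorted by word.
print "$_; $hash{$_}\n" for sort keys %hash;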