@soren
Created Nov 22, 2013
A Perl word-count mapper script for Hadoop's Streaming interface. It reads lines from standard input and, for each word, emits the lowercased word followed by a tab and the count 1. Tested with Java 1.6 and Hadoop 1.0.4.
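#!/usr/bin/env perl
use warnings;
use strict;

# Read input line by line from STDIN (or from files named on the command line).
while (<>) {
    chomp;
    # Split on runs of whitespace and basic punctuation, skip the empty token
    # that split yields when a line starts with a delimiter, and emit "word<TAB>1".
    print lc($_), "\t1\n" foreach grep { length } split /[\s.,:;!?]+/;
}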