@jbenninghoff
Created January 22, 2015 20:51
Another wordcount in Pig
hduser@master:~$ cat wordcount.pig
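-- Load the input. With the default PigStorage loader, $0 is the first tab-delimited field of each line.
A = load '/user/jbenninghoff/somefile.txt';
-- Split each line into tokens, one token per record.
B = foreach A generate flatten(TOKENIZE((chararray)$0)) as word;
-- Keep only tokens made up entirely of word characters (letters, digits, underscore).
C = filter B by word matches '\\w+';
-- Group identical words, then count the records in each group.
D = group C by word;
E = foreach D generate COUNT(C), group;
-- Write the (count, word) pairs to the output directory.
store E into '/user/jbenninghoff/somefileWordcount';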
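
One caveat with the script above: the default loader, PigStorage, splits each line on tabs, so only the text before the first tab ends up in $0 and gets tokenized. If the input may contain tabs, a variant along the following lines might be worth trying. This is a minimal sketch, not the author's method: it assumes the same input path, swaps in the built-in TextLoader so each whole line arrives as a single chararray, and sorts the result by count. The alias names and the final DUMP (used here instead of store, for inspection) are illustrative choices.

```pig
-- Sketch only: aliases (lines, words, clean, grpd, counts, sorted) are illustrative.
-- TextLoader reads each input line as one chararray field, tabs included.
lines  = LOAD '/user/jbenninghoff/somefile.txt' USING TextLoader() AS (line:chararray);
-- One token per record.
words  = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
-- Same filter as the original script: whole token must match \w+.
clean  = FILTER words BY word MATCHES '\\w+';
grpd   = GROUP clean BY word;
counts = FOREACH grpd GENERATE group AS word, COUNT(clean) AS n;
-- Most frequent words first.
sorted = ORDER counts BY n DESC;
-- Print to the console instead of writing to HDFS.
DUMP sorted;
```

Note that the `\\w+` filter drops any token carrying punctuation that TOKENIZE does not treat as a delimiter (a trailing period, for example), so those occurrences are not counted in either version.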