@danbri
Created January 12, 2011 16:17
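-- Load the DBpedia-entry-to-Twitter-screen-name mapping (tab-separated, bzip2-compressed).
linkage = LOAD '/user/danbri/wikipedia/dbpedia2twittername.txt.bz2' USING PigStorage('\t') AS (dbpedia_entry: chararray, screen_name: chararray);
-- WPSB1 is not defined in this snippet; it is assumed to be loaded elsewhere with dbpedia_entry and dbpedia_category fields.
linked_topics = JOIN linkage BY dbpedia_entry, WPSB1 BY dbpedia_entry;
by_topics = GROUP linked_topics BY dbpedia_category;
-- After GROUP, the bag of joined rows carries the name of the grouped relation, linked_topics.
cat_stats = FOREACH by_topics GENERATE group AS category, COUNT(linked_topics);
STORE cat_stats INTO 'edsu_topics';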