@milimetric
Last active December 18, 2015 01:59
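-- Register the Kraken UDF jars and the GeoIP library.
REGISTER 'kraken-pig-0.0.2-SNAPSHOT.jar'
REGISTER 'kraken-generic-0.0.2-SNAPSHOT-jar-with-dependencies.jar'
REGISTER 'geoip-1.2.5.jar'
-- Pull in the LOAD_WEBREQUEST macro.
IMPORT 'include/load_webrequest.pig';
SET default_parallel 2;
-- UDF aliases; none of these are used below.
DEFINE TO_HOUR org.wikimedia.analytics.kraken.pig.ConvertDateFormat('yyyy-MM-dd\'T\'HH:mm:ss', 'yyyy-MM-dd_HH');
DEFINE EXTRACT org.apache.pig.builtin.REGEX_EXTRACT_ALL();
DEFINE ZERO org.wikimedia.analytics.kraken.pig.Zero();
-- Load the mobile Wikipedia webrequest logs for 2013-05-01.
LOG_FIELDS = LOAD_WEBREQUEST('/wmf/raw/webrequest/webrequest-wikipedia-mobile/dt=2013-05-01*');
-- Keep only rows whose x_cs field is set ('-' appears to mark an empty value).
LOG_FIELDS = FILTER LOG_FIELDS BY (x_cs != '-');
-- Count the remaining rows and print the total.
COUNT_1 = FOREACH (GROUP LOG_FIELDS ALL) GENERATE COUNT(LOG_FIELDS);
DUMP COUNT_1;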