@matthayes
Last active December 20, 2015 12:00
An example in Pig of filtering data using conditional logic. Here a tuple is accepted if 'adj' equals either 'red' or 'blue'. As the number of values to check grows, chaining one OR clause per value becomes tedious to write.
-- Load comma-separated (what, adj) pairs.
data = LOAD 'input' USING PigStorage(',') AS (what:chararray, adj:chararray);
DUMP data;
-- (roses,red)
-- (violets,blue)
-- (sugar,sweet)
-- Keep only the tuples whose adj is 'red' or 'blue'.
data2 = FILTER data BY adj == 'red' OR adj == 'blue';
DUMP data2;
-- (roses,red)
-- (violets,blue)
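-- A possible shorter form, as a sketch only: this assumes Pig 0.12 or later,
-- which added an IN operator. It is not part of the original example, and the
-- relation name data3 is made up for illustration.
data3 = FILTER data BY adj IN ('red', 'blue');
DUMP data3;
-- (roses,red)
-- (violets,blue)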