Created
April 20, 2017 01:10
Creating a Pig script for performing group operations
--GROUP OPERATION IN APACHE PIG
--loading weather data into the weather relation
weather = LOAD '/hdpcd/input/post15/post15.csv' USING PigStorage(',');
--performing the group operation based on station name
--station name is the first column in the weather relation, therefore $0
grouped_data = GROUP weather BY $0;
--generating output data with the FOREACH...GENERATE command
--each output record contains the station name as the group key,
--followed by a bag of all weather tuples for that station
output_data = FOREACH grouped_data GENERATE group,weather;
--storing the final output in HDFS
STORE output_data INTO '/hdpcd/output/post15/';
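--OPTIONAL SKETCH, NOT PART OF THE ORIGINAL SCRIPT
--grouping is usually followed by an aggregation over each group's bag
--this assumes a per-station record count is of interest
--station_counts, station and num_records are hypothetical names chosen for illustration
--COUNT is a Pig built-in that counts the tuples in the weather bag whose first field is not null
station_counts = FOREACH grouped_data GENERATE group AS station, COUNT(weather) AS num_records;
--printing the counts to the console instead of storing them
DUMP station_counts;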