Created
March 30, 2017 15:14
Creating a Pig script for data transformation
--this script transforms data using Apache Pig
--it loads the data from the post13.txt file and keeps only three of its fields
--LOAD reads the comma-delimited file into the input_data relation
input_data = LOAD '/hdpcd/input/post13/post13.txt' USING PigStorage(',');
--FOREACH ... GENERATE projects input_data down to the first name,
--location and program, taken from zero-based positions $1, $3 and $5
flat_data = FOREACH input_data GENERATE $1 as fname, $3 as location, $5 as program;
--finally, we print the flat_data relation for confirmation
--DUMP writes the contents of a relation to the terminal window
dump flat_data;
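
A possible variant of the script above, shown only as a sketch: it declares a schema in the LOAD statement so the fields can be referenced by name instead of by position, and it writes the result to HDFS with STORE instead of printing it. It assumes the input has six comma-separated fields; the names f0, f2 and f4 and the output path /hdpcd/output/post13 are hypothetical placeholders, not taken from the original script.

```pig
--sketch only: f0, f2 and f4 are hypothetical names for the unused columns
input_data = LOAD '/hdpcd/input/post13/post13.txt' USING PigStorage(',')
    AS (f0:chararray, fname:chararray, f2:chararray,
        location:chararray, f4:chararray, program:chararray);
--with a schema in place, the projection can use field names instead of $1, $3, $5
flat_data = FOREACH input_data GENERATE fname, location, program;
--DESCRIBE prints the schema of the relation, which is a quick sanity check
DESCRIBE flat_data;
--the output path is a hypothetical example; STORE fails if the directory already exists
STORE flat_data INTO '/hdpcd/output/post13' USING PigStorage(',');
```

Naming the fields up front keeps the FOREACH statement readable and means a later change in column order only has to be fixed in the LOAD schema.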