@lxxstc
Last active December 17, 2015 05:19
Hadoop streaming script
# Run extract_raw.py as the mapper over the raw data file; "cat" acts as an identity reducer.
hadoop jar ./contrib/streaming/hadoop-streaming-0.20.2-cdh3u4.jar \
    -input /home/xiaoxu.lv/publish_fare/raw_data/blacketermmergeresults_zip_2013-05-10.txt \
    -output /home/xiaoxu.lv/publish_fare/rule_detail/ \
    -mapper extract_raw.py \
    -file extract_raw.py \
    -reducer cat \
    -numReduceTasks 41
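# Keep the first tab-separated field of each rule_detail record and count how often each value occurs.
# "uniq -c" relies on the reducer input arriving sorted by key, which the shuffle provides.
hadoop jar ./contrib/streaming/hadoop-streaming-0.20.2-cdh3u4.jar \
    -input /home/xiaoxu.lv/publish_fare/rule_detail/ \
    -output /home/xiaoxu.lv/publish_fare/rule_detail_count_finger/ \
    -mapper "cut -f 1" \
    -reducer "uniq -c" \
    -numReduceTasks 41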
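
The counting job can be dry-run locally before submitting it to the cluster. The sketch below is only an approximation under stated assumptions: the function name count_first_field and the sample records are hypothetical, and a plain "sort" stands in for the Hadoop shuffle, which delivers keys to each reducer in sorted order.

```shell
#!/bin/sh
# Local stand-in for the second streaming job (hypothetical helper name).
count_first_field() {
    # "cut -f 1" plays the mapper, "sort" approximates the shuffle,
    # and "uniq -c" plays the reducer that counts adjacent duplicate keys.
    cut -f 1 | sort | uniq -c
}

# Hypothetical tab-separated records; the real input is the rule_detail/ output.
printf 'ruleA\tx\nruleB\ty\nruleA\tz\n' | count_first_field
```

On the cluster each of the 41 reducers writes its own part file, so the counts are spread over several output files rather than one sorted listing as in this local run.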