@acanalesg
Last active September 14, 2023 12:26
hadoop streaming compression
yarn jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
  -D mapred.output.compress=true \
  -D mapred.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec \
  -input s3://acg-tests/dwell_plus/20161001 \
  -output s3://acg-tests/bzip/dwell_plus/20161001/ \
  -mapper /bin/cat \
  -numReduceTasks 0
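The command above is a map-only identity job: `/bin/cat` passes every record through unchanged, and with zero reducers each mapper writes its own compressed output file. To recompress more than one partition, the same command can be parameterized. The following is only a sketch: the function name `build_compress_cmd` and its arguments are hypothetical and not part of the original gist, and it prints the command (a dry run) rather than executing it.

```shell
#!/bin/sh
# Hypothetical helper (not from the original gist): prints the streaming
# command for one date partition so it can be inspected before running.
# The jar path and property names are copied from the command above.
build_compress_cmd() {
    in_prefix=$1    # input prefix, e.g. s3://acg-tests/dwell_plus
    out_prefix=$2   # output prefix, e.g. s3://acg-tests/bzip/dwell_plus
    day=$3          # partition name, e.g. 20161001
    # Codec class; defaults to the BZip2 codec used in the gist.
    codec=${4:-org.apache.hadoop.io.compress.BZip2Codec}

    # printf repeats the format for every argument, so each token is
    # followed by a single space.
    printf '%s ' \
        yarn jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
        -D mapred.output.compress=true \
        -D "mapred.output.compression.codec=$codec" \
        -input "$in_prefix/$day" \
        -output "$out_prefix/$day/" \
        -mapper /bin/cat \
        -numReduceTasks 0
    printf '\n'
}

# Dry run for the partition used in the gist: prints the command only.
build_compress_cmd s3://acg-tests/dwell_plus s3://acg-tests/bzip/dwell_plus 20161001
```

Once the printed command looks right, piping the output to `sh` would run it. Passing a fourth argument such as `org.apache.hadoop.io.compress.GzipCodec` switches the codec. BZip2 output is splittable by later MapReduce jobs, while gzip output is not.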