@vorpal56
Last active November 24, 2020 18:55
Run Hadoop MapReduce Job on AWS EMR
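# On your local machine: restrict the key file's permissions, then SSH into the EMR master node
sudo chmod 400 <key_name>.pem
ssh -i <key_name>.pem hadoop@<ec2-ip>
# On the master node: install git and fetch the job source
sudo yum install git-core
git clone <repo>
cd <repo>
# Compile against the Hadoop classpath and package the classes into a jar
sudo mkdir compiled
sudo javac <some_directory>/*.java -cp `hadoop classpath` -d compiled/
cd compiled
sudo jar -cvf <application_name>.jar ./<some_directory>/*.class
# Create the HDFS input directory
hadoop fs -mkdir /input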
hadoop fs -put ~/<repo>/<input_directory>/*.gz /input
# Run the job, then copy the results from HDFS into the local repo directory
hadoop jar <application_name>.jar <pkg_name>.<class_name> /input /output
hadoop fs -copyToLocal /output ~/<repo>/
# ----------EXAMPLE----------
sudo yum install git-core
git clone https://github.com/vorpal56/cp422-avg-temp.git
cd cp422-avg-temp
sudo mkdir compiled
sudo javac avgTemp/*.java -cp `hadoop classpath` -d compiled/
cd compiled
sudo jar -cvf AvgTemperature.jar ./avgTemp/*.class
hadoop fs -mkdir /input
hadoop fs -put ~/cp422-avg-temp/input/1904.gz /input/1904.gz
hadoop fs -put ~/cp422-avg-temp/input/1905.gz /input/1905.gz
hadoop jar AvgTemperature.jar avgTemp.AvgTemperature /input /output
hadoop fs -copyToLocal /output ~/cp422-avg-temp/
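# ----------OPTIONAL: INSPECT OUTPUT / RE-RUN----------
# A minimal sketch, not part of the original steps. The helper names
# show_output and reset_output are hypothetical. It assumes the job uses
# Hadoop's standard file output format, which names result files part-*
# and fails if the output directory already exists.

# Print the result files stored under an HDFS output directory.
# The glob is quoted so that HDFS, not the local shell, expands it.
show_output() {
  hadoop fs -cat "$1/part-*"
}

# Delete an HDFS output directory so the job can be run again.
reset_output() {
  hadoop fs -rm -r "$1"
}

# Usage, with the paths used above:
#   show_output /output
#   reset_output /output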