-
-
Save datawrangling/97d22e707ab7ca5fd140 to your computer and use it in GitHub Desktop.
Launching customized cloudera Hadoop EC2 cluster w_ latest testing repo release
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
1) Follow setup steps described here: | |
http://archive.cloudera.com/docs/_getting_started.html | |
(download http://cloudera-packages.s3.amazonaws.com/cloudera-for-hadoop-on-ec2-py-0.3.0-beta.tar.gz) | |
2) Modify "install_hadoop()" function in hadoop-ec2-init-remote.sh: | |
replace: | |
apt-get -y install pig${PIG_VERSION:+-${PIG_VERSION}} | |
with: | |
apt-get -y install hadoop-pig${PIG_VERSION:+-${PIG_VERSION}} | |
3) Start a custom cluster w/ testing release: | |
./hadoop-ec2 launch-cluster --user-packages 'r-base r-base-core r-base-dev r-base-html r-base-latex r-cran-date ant subversion python-rpy python-setuptools python-docutils python-support python-distutils-extra python-simplejson git-core s3cmd policykit' --env REPO=testing --env HADOOP_VERSION=0.20 --auto-shutdown 57 my-hadoop-cluster 20 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment