Install whirr cdh3 release
# launch an ec2 instance with lucid (ubuntu 10.04) e.g. ami-ad36fbc4
# ssh to the machine
################################################################
# install java
# https://ccp.cloudera.com/display/CDHDOC/Java+Development+Kit+Installation
# RELEASE=lucid, which you can find by running lsb_release -c.
################################################################
$ sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
$ sudo apt-get update
$ sudo apt-get install sun-java6-jdk
################################################################
# add CDH3 repository
# https://ccp.cloudera.com/display/CDHDOC/CDH3+Installation#CDH3Installation-AddingaDebianRepository
################################################################
$ sudo vi /etc/apt/sources.list.d/cloudera.list
# add the following contents:
deb http://archive.cloudera.com/debian lucid-cdh3 contrib
deb-src http://archive.cloudera.com/debian lucid-cdh3 contrib
# add the Cloudera public GPG key to your repository:
$ curl -s http://archive.cloudera.com/debian/archive.key | sudo apt-key add -
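# Optional: a minimal sketch for writing the two source lines without an editor.
# write_cloudera_list is a hypothetical helper name, not part of CDH3; it assumes
# lsb_release is available when no release name is passed.

```shell
# write_cloudera_list FILE [RELEASE]
# Writes the deb and deb-src lines for the CDH3 repository to FILE.
# RELEASE defaults to the codename reported by `lsb_release -cs` (e.g. lucid).
write_cloudera_list() {
    out_file=$1
    release=${2:-$(lsb_release -cs)}
    {
        printf 'deb http://archive.cloudera.com/debian %s-cdh3 contrib\n' "$release"
        printf 'deb-src http://archive.cloudera.com/debian %s-cdh3 contrib\n' "$release"
    } > "$out_file"
}
```

# One way to use it: write to a temporary file, review it, then copy it into place:
#   write_cloudera_list /tmp/cloudera.list
#   sudo cp /tmp/cloudera.list /etc/apt/sources.list.d/cloudera.list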
################################################################
# install hadoop, pig, whirr
################################################################
$ sudo apt-get install hadoop-0.20
$ sudo apt-get install hadoop-pig
$ sudo apt-get install whirr
################################################################
# whirr config
# https://ccp.cloudera.com/display/CDHDOC/Whirr+Installation
################################################################
# add ENV VAR to ~/.bashrc or ~/.bash_profile
################################################################
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export HADOOP_CONF_DIR=$HOME/.whirr/hadoop
export PIG_CLASSPATH=${HADOOP_CONF_DIR}
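# Optional: a small sanity check to run before launching a cluster, since the two
# AWS variables above are left blank on purpose. check_whirr_env is a hypothetical
# helper name; the variable list simply mirrors the exports above.

```shell
# check_whirr_env
# Prints the name of every required variable that is unset or empty
# and returns non-zero if any are missing.
check_whirr_env() {
    missing=0
    for name in JAVA_HOME AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY HADOOP_CONF_DIR; do
        # Indirect lookup of the variable called $name (POSIX sh has no ${!name}).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

# Example: check_whirr_env && echo "environment looks complete"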
################################################################
# Generating an SSH Key Pair
################################################################
# write the key to the path referenced by whirr.private-key-file below
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa_whirr
################################################################
# Hadoop Cluster on AWS EC2
################################################################
# Defining a Whirr Cluster
# locate the example hadoop-ec2.properties and modify it
$ vi hadoop-ec2.properties
whirr.cluster-name=hadoop
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,4 hadoop-datanode+hadoop-tasktracker
whirr.hadoop-install-function=install_cdh_hadoop
whirr.hadoop-configure-function=configure_cdh_hadoop
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.hardware-id=m1.large
whirr.image-id=us-east-1/ami-3f6ca156
whirr.location-id=us-east-1
whirr.private-key-file=/home/ubuntu/.ssh/id_rsa_whirr
whirr.public-key-file=${whirr.private-key-file}.pub
################################################################
$ whirr launch-cluster --config hadoop-ec2.properties
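# HADOOP_CONF_DIR above ($HOME/.whirr/hadoop) matches whirr.cluster-name=hadoop.
# If the cluster name is changed, the directory presumably changes with it. A
# minimal sketch that derives the path from the properties file; whirr_conf_dir
# is a hypothetical helper, and the ~/.whirr/<cluster-name> layout is assumed
# from the settings in these notes.

```shell
# whirr_conf_dir PROPERTIES_FILE
# Prints $HOME/.whirr/<cluster-name>, reading whirr.cluster-name from the file.
# Returns non-zero if the property is not present.
whirr_conf_dir() {
    props=$1
    name=$(sed -n 's/^whirr\.cluster-name=//p' "$props" | head -n 1)
    [ -n "$name" ] || return 1
    printf '%s\n' "$HOME/.whirr/$name"
}
```

# Example: export HADOOP_CONF_DIR=$(whirr_conf_dir hadoop-ec2.properties)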
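################################################################
# install dumbo
################################################################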
$ wget -O ez_setup.py http://bit.ly/ezsetup
$ sudo python ez_setup.py dumbo
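################################################################
# install some useful python modules
################################################################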
$ sudo apt-get install python-scipy python-numpy