- Note: you will need to set the following env vars with your Cloudera paywall credentials: username and paywall_password
- Bring up 4 VMs imaged with RHEL/CentOS 7.x (e.g. node1-4 in this case)
- On the non-Ambari nodes (e.g. nodes 2-4 in this case), install ambari-agent and point it at the Ambari node (e.g. node1 in this case)
export ambari_server=node1
export ambari_version=2.7.5.0
export username=myuser
export paywall_password=mypass
curl -sSL https://raw.githubusercontent.com/harshn08/ambari-bootstrap/master/ambari-bootstrap.sh | sudo -E sh
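If the nodes are reachable over SSH, the agent bootstrap above can be pushed to nodes 2-4 from one shell. This is a sketch that assumes passwordless SSH as a sudo-capable user and only echoes the commands (remove the leading echo to actually run them; the paywall credentials exported earlier would also need forwarding):

```shell
# Sketch: push the agent bootstrap to each non-Ambari node over SSH.
# Assumes passwordless SSH; 'echo' makes this a dry run that prints
# the command for each node instead of executing it.
for h in node2 node3 node4; do
  echo ssh "$h" "export ambari_server=node1 ambari_version=2.7.5.0; \
    curl -sSL https://raw.githubusercontent.com/harshn08/ambari-bootstrap/master/ambari-bootstrap.sh | sudo -E sh"
done
```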
- On node2, install the MySQL RPM. By default, this is where Hive will be installed. (It's also safe to run this on all the nodes.)
# Install the MySQL community release RPM
sudo rpm -Uvh http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm
- On the Ambari node, install ambari-server
export install_ambari_server=true
export ambari_version=2.7.5.0
export username=myuser
export paywall_password=mypass
curl -sSL https://raw.githubusercontent.com/harshn08/ambari-bootstrap/master/ambari-bootstrap.sh | sudo -E sh
sudo yum install -y git
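ambari-server can take a minute or two to start accepting API calls after install. A small polling helper (hypothetical, not part of the bootstrap scripts) avoids racing it before the confirmation step below:

```shell
# wait_for TIMEOUT CMD...: retry CMD every 2s until it succeeds or
# TIMEOUT seconds elapse. Hypothetical helper, not part of ambari-bootstrap.
wait_for() {
  local timeout=$1; shift
  local elapsed=0
  until "$@"; do
    sleep 2
    elapsed=$((elapsed + 2))
    [ "$elapsed" -ge "$timeout" ] && return 1
  done
  return 0
}

# Example: block until the Ambari REST API answers (up to 5 minutes)
# wait_for 300 curl -sf -u admin:admin http://localhost:8080/api/v1/hosts
```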
- Confirm all 4 agents were registered and the agent service is up
curl -u admin:admin -H X-Requested-By:ambari http://localhost:8080/api/v1/hosts
sudo service ambari-agent status
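The hosts call above returns a JSON items array with one entry per registered host, so a quick way to count registrations is to grep for host_name entries. Shown here against a sample payload mirroring that response shape (the real check pipes the curl output instead):

```shell
# Count registered hosts in an /api/v1/hosts-style response.
# In practice, pipe the real response:
#   curl -s -u admin:admin http://localhost:8080/api/v1/hosts | grep -c '"host_name"'
cat << 'EOF' > /tmp/hosts-sample.json
{ "items" : [
    { "Hosts" : { "host_name" : "node1" } },
    { "Hosts" : { "host_name" : "node2" } },
    { "Hosts" : { "host_name" : "node3" } },
    { "Hosts" : { "host_name" : "node4" } } ] }
EOF
grep -c '"host_name"' /tmp/hosts-sample.json   # prints 4
```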
- At this point you could install HDP via Ambari's install wizard by opening http://<AMBARI_IP>:8080 and following the prompts to customize as needed. If you prefer to deploy a cluster with default settings, follow the steps below instead.
- Generate the blueprint: you can generate the blueprint and cluster file using the Ambari recommendations API with the steps below. For more details on the bootstrap scripts, see the bootstrap script GitHub repository.
sudo yum install -y python-argparse mysql-connector-java*
sudo ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
git clone https://github.com/harshn08/ambari-bootstrap.git
export ambari_stack_version=3.1
export recommendation_strategy="ALWAYS_APPLY_DONT_OVERRIDE_CUSTOM_VALUES"
# Select the services to be deployed (uncomment one option)
# Option A: minimal services
#export ambari_services="HDFS MAPREDUCE2 YARN ZOOKEEPER HIVE"
# Option B: most services
#export ambari_services="ATLAS HBASE PHOENIX HDFS HIVE KAFKA LOGSEARCH AMBARI_INFRA PIG SPARK SQOOP MAPREDUCE2 STORM TEZ YARN ZOOKEEPER ZEPPELIN"
cd ~/ambari-bootstrap/deploy
cat << EOF > configuration-custom.json
{
"configurations" : {
"core-site": {
"hadoop.proxyuser.root.users" : "admin",
"fs.trash.interval": "4320"
},
"hdfs-site": {
"dfs.replication": "1",
"dfs.namenode.safemode.threshold-pct": "0.99"
},
"hive-site": {
"hive.server2.transport.mode" : "binary"
}
}
}
EOF
./deploy-recommended-cluster.bash
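Before deploying, the generated configuration-custom.json can be sanity-checked with Python's json.tool module, which exits non-zero on malformed input. python3 is used here for illustration (on the cluster hosts, the installed python binary works the same way), against a trimmed copy of the same content:

```shell
# Validate JSON before handing it to the deploy script. A trimmed copy is
# used here; point the real check at configuration-custom.json in
# ~/ambari-bootstrap/deploy instead.
cat << 'EOF' > /tmp/configuration-custom.json
{ "configurations" : { "hdfs-site" : { "dfs.replication" : "1" } } }
EOF
python3 -m json.tool /tmp/configuration-custom.json > /dev/null \
  && echo "valid JSON" || echo "malformed JSON"
```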
- You can monitor the progress of the deployment via Ambari (e.g. http://node1:8080).
- Once the cluster is up, you can use the automation below to install a KDC and enable Kerberos.
- Set the following env vars
export cluster_name=hdp ## replace this with your cluster name
export ambari_pass=BadPass#1 ## replace this with your Ambari password
export kdc_realm=CLOUDERA.COM ## replace this with your desired Kerberos realm
- Now run the below script to enable Kerberos
cd /tmp
git clone https://github.com/crazyadmins/useful-scripts.git
cd useful-scripts/ambari/
cat << EOF > ambari.props
CLUSTER_NAME=${cluster_name}
AMBARI_ADMIN_USER=admin
AMBARI_ADMIN_PASSWORD=${ambari_pass}
AMBARI_HOST=$(hostname -f)
KDC_HOST=$(hostname -f)
REALM=${kdc_realm}
KERBEROS_CLIENTS=$(hostname -f)
EOF
cat ambari.props
chmod +x *.sh
./setup_kerberos.sh
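If setup_kerberos.sh fails, a quick thing to verify before re-running is that ambari.props defines every key the heredoc above sets. A sketch against a sample props file with placeholder values (point the real check at ./ambari.props):

```shell
# Check that ambari.props defines every key set by the heredoc above.
# Placeholder values for illustration only.
cat << 'EOF' > /tmp/ambari.props
CLUSTER_NAME=hdp
AMBARI_ADMIN_USER=admin
AMBARI_ADMIN_PASSWORD=BadPass#1
AMBARI_HOST=node1.example.com
KDC_HOST=node1.example.com
REALM=CLOUDERA.COM
KERBEROS_CLIENTS=node1.example.com
EOF
missing=0
for key in CLUSTER_NAME AMBARI_ADMIN_USER AMBARI_ADMIN_PASSWORD AMBARI_HOST KDC_HOST REALM KERBEROS_CLIENTS; do
  grep -q "^${key}=" /tmp/ambari.props || { echo "missing: ${key}"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all keys present"
```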