Skip to content

Instantly share code, notes, and snippets.

@bugcy013
Last active December 21, 2015 06:57
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save bugcy013/91cd7767e5306d1703e5 to your computer and use it in GitHub Desktop.
Save bugcy013/91cd7767e5306d1703e5 to your computer and use it in GitHub Desktop.
#! /bin/bash
/bin/cp /etc/cloudera-scm-server/db.properties /etc/cloudera-scm-server/db.properties.bk
#---- real impact to prod below
#/etc/init.d/cloudera-scm-server stop
cat /etc/cloudera-scm-server/db.properties | grep 'com.cloudera.cmf.db.password'
#
# /etc/cloudera-scm-server/db.properties
#
# Auto-generated by initialize_embedded_db.sh
#
# 20140523-144214
#
# These are database settings for CM Manager
#
#com.cloudera.cmf.db.type=postgresql
#com.cloudera.cmf.db.host=localhost:7432
#com.cloudera.cmf.db.name=scm
#com.cloudera.cmf.db.user=scm
#com.cloudera.cmf.db.password=
rm -f /tmp/dataonly
pg_dump -h localhost -p 7432 --data-only --no-owner --no-acl --attribute-inserts --disable-dollar-quoting --no-tablespaces -U scm > /tmp/dataonly
#---- real impact to prod below
#etc/init.d/cloudera-scm-server-db stop
sed -i 's/SET /# SET /g' /tmp/dataonly
sed -i 's/ "timestamp", / `timestamp`, /g' /tmp/dataonly
sed -i 's/# SET search_path = public, pg_catalog;/SET foreign_key_checks = 0;/g' /tmp/dataonly
mysql -se "drop database scm"
mysql -se "create database scm"
mysql scm < cmf.nodata.dmp
mysql scm < /tmp/dataonly
#---- real impact to prod below
#mysql -se 'create user `scm`@`localhost` identified by ``; grant all privileges on `scm`.`*` to `scm`@`localhost`; flush privileges;'
#mysql -se 'create user `scm`@`10.%` identified by ``; grant all privileges on `scm`.`*` to `scm`@`10.%`; flush privileges;'
#sed -i 's/com.cloudera.cmf.db.type=postgresql/com.cloudera.cmf.db.type=mysql/' /etc/cloudera-scm-server/db.properties
#sed -i 's/com.cloudera.cmf.db.host=localhost:7432/com.cloudera.cmf.db.host=localhost:3306/' /etc/cloudera-scm-server/db.properties
Creating Databases for the Cloudera Manager Services
Create and configure MySQL databases for the Cloudera Management Services (Activity Monitor, Service Monitor, Host Monitor, and Report Manager), the Hive Metastore and Cloudera Navigator (optional). The databases must be configured to support UTF-8 character set encoding.
For performance reasons, it’s generally a good idea to have the Hive Metastore Server on the database server.
By default, Cloudera Manager uses Derby for Oozie’s database and SQLite for Hue’s database. You can configure MySQL after Cloudera Manager is installed.
1. Connect to MySQL on myhost1 as the root user.
$ mysql -u root -p
Enter password:
2. Create a database for the Activity Monitor.
mysql> create database amon DEFAULT CHARACTER SET utf8;
mysql> grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'passwd';
3. Create a database for the Service Monitor.
mysql> create database smon DEFAULT CHARACTER SET utf8;
mysql> grant all on smon.* TO 'smon'@'%' IDENTIFIED BY 'passwd';
4. Create a database for the Host Monitor.
mysql> create database hmon DEFAULT CHARACTER SET utf8;
mysql> grant all on hmon.* TO 'hmon'@'%' IDENTIFIED BY 'passwd';
5. Create a database for the Report Manager.
mysql> create database rman DEFAULT CHARACTER SET utf8;
mysql> grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'passwd';
6. Create a database for Cloudera Navigator (optional).
mysql> create database nav DEFAULT CHARACTER SET utf8;
mysql> grant all on nav.* TO 'nav'@'%' IDENTIFIED BY 'passwd';
7. Create a database for the Hive metastore. Create a separate metastore for each Hive service, if you have more than one.
mysql> create database hive DEFAULT CHARACTER SET utf8;
mysql> grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'passwd';
8. Create a database for Hue.
mysql> create database hue DEFAULT CHARACTER SET utf8;
mysql> grant all on hue.* to 'hue'@'%' IDENTIFIED BY 'passwd';
9. Create a database for Oozie.
mysql> create database oozie DEFAULT CHARACTER SET utf8;
mysql> grant all on oozie.* to 'oozie'@'%' IDENTIFIED BY 'passwd';
Installing the MySQL JDBC Connector
Install the JDBC connector on the Cloudera Manager Server host, as well as hosts to which you assign the Activity Monitor, Service Monitor, Host Monitor, Report Manager, Hive Metastore, and Cloudera Navigator roles. In this case, all are on same host.
Download JDBC Driver for MySQL from http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.35.tar.gz
# tar zxvf mysql-connector-java-5.1.35.tar.gz
# cp mysql-connector-java-5.1.35/mysql-connector-java-5.1.35-bin.jar /usr/share/java/
# ln -s /usr/share/java/mysql-connector-java-5.1.35-bin.jar /usr/share/java/mysql-connector-java.jar
Add the Cloudera Manager Repository
# cd /etc/yum.repos.d
# wget http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/cloudera-manager.repo
Installing the Cloudera Manager Server
Install the Cloudera Manager Server either on the machine where the database is installed, or on a machine that has access to the database. This machine need not be a host in the cluster that you want to manage with Cloudera Manager. The Cloudera Manager Server does not require CDH4 to be installed on the same machine.
Note :
1. Cloudera Manager requires Hadoop to be installed on all hosts, but Hadoop must not be configured and must not be running.
2. The Activity Monitor in Cloudera Manager 4.0 requires the hue-plugins package to be installed on the JobTracker host, regardless of whether you are using Hue. If you are using Hue, the hue-plugins package must be installed on all hosts.
# yum -y install cloudera-manager-daemons cloudera-manager-server
Configuring the Database for the Cloudera Manager Server
As we are not using the embedded database, remove /etc/cloudera-scm-server/db.mgmt.properties.
Enable Cloudera Manager Server to connect to external database by running the script on Cloudera Manager Server host.
# /usr/share/cmf/schema/scm_prepare_database.sh mysql -h localhost -u abitra -p --scm-host localhost scm scm scm
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.6.0_31/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
Retrieving the Database Host, User Name, or Password
# cat /etc/cloudera-scm-server/db.properties
Start the Cloudera Manager Server
# service cloudera-scm-server start
Configuring Services
Login Cloudera Manager Admin Console at http://myhost1.example.com:7180/
The default credentials are Username: admin Password: admin
In Welcome to Cloudera page, click Continue.
Please restart Cloudera Manager to allow the new license to take effect. Existing clusters or services will be unaffected by the restart of Cloudera Manager.
# service cloudera-scm-server restart
Stopping cloudera-scm-server: [ OK ]
Starting cloudera-scm-server: [ OK ]
After the Cloudera Manager server restarts, login again.
The Cloudera Manager will enable to choose the packages for the below services. Click Continue.
· Apache Hadoop (Common, HDFS, MapReduce, YARN)
· Apache HBase
· Apache ZooKeeper
· Apache Oozie
· Apache Hive
· Hue (Apache licensed)
· Apache Flume
· Cloudera Impala (Apache licensed)
· Apache Sqoop
· Cloudera Search (Apache licensed)
Click Continue to proceed with the installation.
Specify hosts for your CDH cluster installation including Cloudera Manager Server host and then click Search to find the cluster hosts.
myhost2, myhost3, myhost4, myhost5
Cloudera Manager identified the hosts to configure them for CDH.
Select the hosts where you want to install CDH and click Continue.
On Cluster Installation page, select repository type you want to use for the installation. Choose Use Parcels. Under More Options, you can add repository for previous versions.
Select the specific releases of Impala and Solr to install on your hosts. If you do not want to install those products, choose None.
Select the specific release of Cloudera Manager Agent you want to install on your hosts.
Cloudera Manager and Cloudera Distribution of Hadoop (CDH) are comprised of a set of services. These services interact among each other and use databases to complete tasks.
Provide SSH Login credentials for root.
Cloudera manager daemons, cloudera manager agent and jdk get installed on the previous selected hosts. Click Continue.
Selected parcels get installed on the previous selected hosts. Click Continue.
Choose cluster services you want to install. You can choose one of the standard combinations: Core
Hadoop, Real-Time Delivery (previously known as HBase Services), Real-Time Query (which includes HDFS, Hive and Impala), or All Services; these combinations take into account the dependencies between the Hadoop services. Alternatively, you can choose Custom Services, and select the services individually.
Note:
Some services depend on others; for example, HBase requires HDFS and ZooKeeper.
The Cloudera Management Services, which are added to each package, are Cloudera Manager processes that run to support monitoring and management features in Cloudera Manager. Cloudera Navigator is a system to support enforcement of compliance with company policies for data stored in a Hadoop Distributed File System (HDFS) deployment.
After the selection of the services, customize the role assignments for each node in the cluster. Click Inspect Role Assignments.
On the Database Setup page, enter the information for the Service Monitor, Activity Monitor, Host Monitor, Report Manager, and Hive metastore databases. Click Test Connection to confirm that Cloudera Manager can communicate with the databases. This transaction takes two heartbeats to complete (about 30 seconds with the default heartbeat interval). If the test succeeds in all cases, click Continue; otherwise check and correct the information you have provided for the databases and then try the test again.
On Review Configuration Changes page, confirm the settings entered for file system paths, such as the NameNode Data Directory and the DataNode Data Directory. Supply the name of the mail server (it can be localhost), the mail server user, and the mail recipients.
The wizard starts the services on your cluster.
When all of the services are started, click Continue.
Start the Cloudera Manager Agent
# service cloudera-scm-agent start
Starting cloudera-scm-agent: [ OK ]
Change the Default Administrator Password
Change the default administrator password as soon as beginning to use Cloudera Manager.
1. From the Administration tab, select Users.
2. Click the Change Password button next to the admin account.
3. Enter a new password twice and then click Update.
Specifying the Racks for Hosts
Cloudera Manager includes internal rack awareness scripts, but you must specify the racks where the hosts in your cluster are located. If your cluster contains more than 10 hosts, Cloudera recommends that you specify the rack for each host. HDFS and MapReduce will automatically use the racks you specify.
1. Click the Hosts tab.
2. Select the host(s) for a particular rack.
3. From Actions for Selected tab, click Assign Rack.
4. Enter new rack name such as /rack1 and then click Confirm.
After assigning racks, restart affected services.
Checking Host Heartbeats
By default, every Agent must heartbeat successfully every 15 seconds.
1. Click the Hosts tab.
2. See a list of all the hosts along with the value of Last Heartbeat.
Enabling Oozie Web Console
1. Download ext-2.2.zip from http://extjs.com/deploy/ext-2.2.zip
2. Extract the contents of the file to /usr/lib/oozie/libext on the Oozie server.
3. On Oozie Service page, select Configuration > View and Edit.
4. Check Enable Oozie Server Web Console.
5. Click on Save Changes.
6. Restart the Oozie Service.
Configuring MySQL for Oozie
By default, Cloudera Manager uses Derby for Oozie Database.
1. On Oozie Service page, select Configuration > View and Edit.
2. In the Category Pane, expand Oozie Server (Default) and click Database.
3. Specify the settings for Oozie Server Database Type, Oozie Server Database Name, Oozie Server Database Host, Oozie Server Database User, and Oozie Server Database Password.
4. Create symlink for MySQL connector.
# ln -s /usr/share/java/mysql-connector-java.jar /var/lib/oozie/mysql-connector-java.jar
5. Start the Oozie Service.
Configuring MySQL for Hue
By default Cloudera Manager uses SQLite for Hue Database.
1. From The Hue service instance page, click Actions > Stop. Confirm you want to stop the service by clicking Stop.
2. Click Configuration > View and Edit. In the Category Pane, expand Service-Wide and click Database.
3. Specify the settings for Hue Database Type, Hue Database Hostname, Hue Database Port, Hue Database Username, Hue Database Password, and Hue Database Name.
4. Restart the Hue service.
Changing the CM database is a little annoying, but very possible. Here's
the steps, which involve cluster downtime:
1) Stop everything through CM UI
2) save to a file the JSON from http://
<host>:7180/api/v4/cm/deployment?view=export
3) stop CM (sudo service cloudera-scm-server stop)
4) stop all CM agents on all hosts in your cluster (sudo service
cloudera-scm-agent stop)
5) back up and then edit /etc/cloudera-scm-server/db.properties to point to
your new mysql, empty schema
6) start CM (sudo service cloudera-scm-server start)
7) log in to CM as admin/admin. It will think it is brand new (since it's
on a new db)
8) HTML PUT the data from step 2 to the same url,
http://<host>:7180/api/v4/cm/deployment?view=export.
You can use curl or a browser plugin like POSTMAN to do this.
8a) You should see all of your cluster configuration re-appear, and all of
your hosts should be present
9) start all CM agents on all hosts in your cluster (sudo service
cloudera-scm-agent start)
10) start the cluster and management service in the CM UI
11) When you are satisfied everything is working, stop the embedded db
(sudo service cloudera-scm-server-db stop)
If anything goes wrong, your CM configuration data is still in the embedded
postgresql and you can easily revert by replacing db.properties with the
backup you took in step 5.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment