Hello World for HBase
* This brief HELLO WORLD Java program is meant to enable you to very quickly
* gain a rudimentary, hands-on understanding of how data (and metadata) is
* stored and retrieved in HBase via the "client API".
* ================
* For those coming to the HBase world with previous experience in traditional
* RDBMS databases, it is essential to realize that Tables, Rows, and Columns
* in HBase, while bearing some resemblance to their namesakes in the RDBMS
* world, differ markedly in their structures and functionality.
* **Column Families**
* As you can see in the code below, when you use the Admin#createTable method,
* besides providing a TableName, you also must specify at least one "Column
* Family" (denoted in the code by the class HColumnDescriptor).
* In HBase, Columns are all grouped by Column Family, with all Columns in a
* family being physically stored together. Theoretically, you could have a
* large number of Column Families, but the present HBase architecture
* actually has a practical limitation of no more than three or four per Table.
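The grouping described above can be pictured as a two-level map: family name first, then column name within the family. The following is a minimal plain-Java sketch of that shape only, not HBase's storage code. The class `ColumnFamilySketch` and its methods are hypothetical names introduced for illustration, and the choice to reject writes to an undeclared family is an assumption made to mirror the requirement that families be declared when the table is created.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of column families; not part of HBase. */
final class ColumnFamilySketch {
  // family name -> (column name -> value); families are fixed at construction
  private final TreeMap<String, TreeMap<String, String>> families =
      new TreeMap<>();

  ColumnFamilySketch(final String... familyNames) {
    for (String familyName : familyNames) {
      families.put(familyName, new TreeMap<>());
    }
  }

  /** Stores a value; the column is created on the fly, the family is not. */
  void put(final String family, final String column, final String value) {
    TreeMap<String, String> columns = families.get(family);
    if (columns == null) {
      throw new IllegalArgumentException("Unknown column family: " + family);
    }
    columns.put(column, value);
  }

  /** Returns all columns stored together under one family. */
  Map<String, String> columnsOf(final String family) {
    return families.get(family);
  }

  public static void main(final String[] args) {
    ColumnFamilySketch table = new ColumnFamilySketch("cf");
    table.put("cf", "myFirstColumn", "Hello");
    table.put("cf", "mySecondColumn", "World!");
    System.out.println(table.columnsOf("cf"));
    // prints {myFirstColumn=Hello, mySecondColumn=World!}
  }
}
```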
* **Versioning**
* If a "maxVersions" value of 3 is assigned to a Column Family (the listing
* below creates its family with default settings), versioning is enabled for
* all Columns in the family: when a Column is updated, the 2 most recent
* *previous* values for that Column are still retrievable, each designated by
* a timestamp. These individual versioned instances are sometimes referred to
* as the Cells of a Column.
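To make the "3 versions means the current value plus 2 previous ones" arithmetic concrete, here is a minimal plain-Java sketch of the bookkeeping. It is an illustration under stated assumptions, not how HBase stores or prunes cells: the class `VersionedColumnSketch` is hypothetical, timestamps are plain `long` values supplied by the caller, and old versions are dropped immediately on write purely to keep the example short.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical in-memory model of one versioned column; not part of HBase. */
final class VersionedColumnSketch {
  private final int maxVersions;
  // timestamp -> value, ordered newest timestamp first
  private final TreeMap<Long, String> cells =
      new TreeMap<>(Comparator.reverseOrder());

  VersionedColumnSketch(final int maxVersions) {
    this.maxVersions = maxVersions;
  }

  /** Stores a value under a timestamp and drops versions beyond the limit. */
  void put(final long timestamp, final String value) {
    cells.put(timestamp, value);
    while (cells.size() > maxVersions) {
      cells.pollLastEntry(); // last entry in this ordering is the oldest
    }
  }

  /** Returns the retained values, newest first. */
  List<String> versions() {
    return new ArrayList<>(cells.values());
  }

  public static void main(final String[] args) {
    VersionedColumnSketch column = new VersionedColumnSketch(3);
    column.put(1L, "v1");
    column.put(2L, "v2");
    column.put(3L, "v3");
    column.put(4L, "v4");
    System.out.println(column.versions()); // prints [v4, v3, v2]
  }
}
```

After the fourth write, the current value `v4` and its two most recent predecessors remain; `v1` is no longer retrievable.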
* **Columns**
* It is important to note (in the most striking departure from RDBMS norms)
* that Columns themselves are NOT part of the Table definition. Columns are
* "defined" on-the-fly as each row is <put> (i.e., inserted/updated) into the
* database. There is also NO datatyping of each Column: HBase accepts any
* byte-array of any length/format you wish to store in any Column. This means
* that the application, not HBase, must keep track of COLUMN NAMES AND
* DATATYPES. In the RDBMS world, the database (i.e., database
* administrator) manages column metadata; in the HBase world, the application
* (i.e., application designer/programmer) manages column metadata.
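Because every stored value is just a byte array, the application has to remember how each column was encoded in order to decode it. The sketch below illustrates that responsibility using only `java.nio`, so it runs without HBase on the classpath; the listing itself uses HBase's `Bytes` utility for the same purpose. The class `ColumnEncodingSketch` and its method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing application-managed column datatypes. */
final class ColumnEncodingSketch {

  /** Encodes a long as 8 big-endian bytes. */
  static byte[] encodeLong(final long value) {
    return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
  }

  /** Decodes 8 big-endian bytes back into a long. */
  static long decodeLong(final byte[] bytes) {
    return ByteBuffer.wrap(bytes).getLong();
  }

  /** Encodes a String as UTF-8 bytes. */
  static byte[] encodeString(final String value) {
    return value.getBytes(StandardCharsets.UTF_8);
  }

  /** Decodes UTF-8 bytes back into a String. */
  static String decodeString(final byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  public static void main(final String[] args) {
    byte[] stored = encodeLong(42L);
    // The store only sees 8 anonymous bytes; the caller must know that
    // this column holds a long before it can decode it correctly.
    System.out.println(stored.length);      // prints 8
    System.out.println(decodeLong(stored)); // prints 42
    System.out.println(decodeString(encodeString("Hello"))); // prints Hello
  }
}
```

Nothing in the stored bytes says which decoder applies, which is why the column's datatype must be tracked by the application.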
* **Rows**
* Rows are inserted, accessed, and physically ordered exclusively by Row ID
* (the conceptual equivalent of an RDBMS primary key). When a "scan" is
* performed to access multiple contiguous rows, those rows will always be
* returned in Row ID order (either ascending or descending).
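Row ID order is byte order, not numeric or natural-language order, which matters when designing Row IDs. The following plain-Java sketch sorts a few Row IDs the way a byte-wise comparison would; the class `RowOrderSketch` and its methods are hypothetical, and the sorted map merely stands in for a table.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical model of Row ID ordering; not part of HBase. */
final class RowOrderSketch {

  /** Compares byte arrays lexicographically, treating bytes as unsigned. */
  static int compareRowIds(final byte[] left, final byte[] right) {
    int common = Math.min(left.length, right.length);
    for (int i = 0; i < common; i++) {
      int diff = (left[i] & 0xff) - (right[i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return left.length - right.length; // a prefix sorts before longer keys
  }

  /** Returns the given Row IDs in ascending byte order. */
  static List<String> scanOrder(final String... rowIds) {
    TreeMap<byte[], String> rows = new TreeMap<>(RowOrderSketch::compareRowIds);
    for (String rowId : rowIds) {
      rows.put(rowId.getBytes(StandardCharsets.UTF_8), rowId);
    }
    return new ArrayList<>(rows.values());
  }

  public static void main(final String[] args) {
    System.out.println(scanOrder("row10", "row2", "row1"));
    // prints [row1, row10, row2]
    System.out.println(scanOrder("row10", "row02", "row01"));
    // prints [row01, row02, row10]
  }
}
```

The first call shows `row10` sorting ahead of `row2`; zero-padding the numeric part, as in the second call, is one common way to make byte order match the intended order.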
* ==============================
* Importantly, your ability to run this code requires that you have successfully
* installed and started a standalone implementation of HBase on the machine
* on which this program is to be run.
* The recommended steps to take to run this program are:
* (1) Install a "standalone" configuration of the current stable release of
* HBase on your machine following the instructions provided at:
* (If you are installing on a Windows machine, it is strongly recommended
* that you NOT bother trying to do an installation using the documented
* Cygwin option [which has proven to be faulty and is apparently not
* kept up-to-date with new releases of HBase], but instead install and
* run a virtual Unix environment [e.g., Ubuntu] in a virtual machine
* such as VirtualBox, and install HBase in that environment.)
* (2) Copy this code into a new project in your favorite IDE, set up the
* CLASSPATH as documented below, and use this code as your launchpad into
* effective utilization of the HBase Client API. Run and modify this code
* as extensively as you need to in order to build and deepen your
* understanding of how to store and retrieve data (and metadata!) in HBase.
* Refer to the HBase javadocs ( )
* to extend this code and explore functionality not demonstrated in the
* code below.
* This code was developed in coordination with HBase release;
* compatibility with subsequent releases is hoped for, but by no means
* guaranteed.
* =========================
* To fulfill CLASSPATH requirements to compile/run this program:
* -- the CLASSPATH must include the directory in which hbase-site.xml (i.e.,
* the HBase startup parameters file) is stored for your currently-running
* instance of HBase (e.g., '/usr/local/hbase/hbase-').
* [In NetBeans, this would be set in Project Properties/Libraries/Run.]
* -- the CLASSPATH should also include the HBase library (e.g. "HBase_1.0.1.1").
* [In NetBeans, you can include this library in your project's
* "Compile-time Libraries" list.]
package org.prettygoodexamples.hellohbase;

import java.io.IOException;
import java.util.Map.Entry;
import java.util.NavigableMap;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.NamespaceDescriptor;
import org.apache.hadoop.hbase.NamespaceNotFoundException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

/**
 * Successful running of this application requires access to an active instance
 * of HBase. For install instructions for a standalone instance of HBase, please
 * refer to
 */
public final class HelloHBase {

  protected static final String MY_NAMESPACE_NAME = "myTestNamespace";
  static final TableName MY_TABLE_NAME = TableName.valueOf("myTestTable");
  static final byte[] MY_COLUMN_FAMILY_NAME = Bytes.toBytes("cf");
  static final byte[] MY_FIRST_COLUMN_QUALIFIER
          = Bytes.toBytes("myFirstColumn");
  static final byte[] MY_SECOND_COLUMN_QUALIFIER
          = Bytes.toBytes("mySecondColumn");
  static final byte[] MY_ROW_ID = Bytes.toBytes("rowId01");

  public static void main(final String[] args) throws IOException {
    final boolean deleteAllAtEOJ = true;
    /*
     * ConnectionFactory#createConnection() automatically looks for
     * hbase-site.xml (HBase configuration parameters) on the system's
     * CLASSPATH, to enable creation of Connection to HBase via Zookeeper.
     */
    try (Connection connection = ConnectionFactory.createConnection();
            Admin admin = connection.getAdmin()) {
      admin.getClusterStatus(); // assure connection successfully established
      System.out.println("\n*** Hello HBase! -- Connection has been "
              + "established via Zookeeper!!\n");
      createNamespaceAndTable(admin);
      System.out.println("Getting a Table object for [" + MY_TABLE_NAME
              + "] with which to perform CRUD operations in HBase.");
      try (Table table = connection.getTable(MY_TABLE_NAME)) {
        putRowToTable(table);
        getAndPrintRowContents(table);
        if (deleteAllAtEOJ) {
          deleteRow(table);
        }
      }
      if (deleteAllAtEOJ) {
        deleteNamespaceAndTable(admin);
      }
    }
  }

  /**
   * Invokes Admin#createNamespace and Admin#createTable to create a namespace
   * with a table that has one column-family.
   *
   * @param admin Standard Admin object
   * @throws IOException If IO problem encountered
   */
  static void createNamespaceAndTable(final Admin admin) throws IOException {
    if (!namespaceExists(admin, MY_NAMESPACE_NAME)) {
      System.out.println("Creating Namespace [" + MY_NAMESPACE_NAME + "].");
      admin.createNamespace(NamespaceDescriptor
              .create(MY_NAMESPACE_NAME).build());
    }
    if (!admin.tableExists(MY_TABLE_NAME)) {
      System.out.println("Creating Table [" + MY_TABLE_NAME.getNameAsString()
              + "], with one Column Family ["
              + Bytes.toString(MY_COLUMN_FAMILY_NAME) + "].");
      admin.createTable(new HTableDescriptor(MY_TABLE_NAME)
              .addFamily(new HColumnDescriptor(MY_COLUMN_FAMILY_NAME)));
    }
  }

  /**
   * Invokes Table#put to store a row (with two new columns created 'on the
   * fly') into the table.
   *
   * @param table Standard Table object (used for CRUD operations).
   * @throws IOException If IO problem encountered
   */
  static void putRowToTable(final Table table) throws IOException {
    // The two values match those shown in the console output: Hello, World!
    table.put(new Put(MY_ROW_ID)
            .addColumn(MY_COLUMN_FAMILY_NAME, MY_FIRST_COLUMN_QUALIFIER,
                    Bytes.toBytes("Hello"))
            .addColumn(MY_COLUMN_FAMILY_NAME, MY_SECOND_COLUMN_QUALIFIER,
                    Bytes.toBytes("World!")));
    System.out.println("Row [" + Bytes.toString(MY_ROW_ID)
            + "] was put into Table ["
            + table.getName().getNameAsString() + "] in HBase;\n"
            + " the row's two columns (created 'on the fly') are: ["
            + Bytes.toString(MY_COLUMN_FAMILY_NAME) + ":"
            + Bytes.toString(MY_FIRST_COLUMN_QUALIFIER)
            + "] and [" + Bytes.toString(MY_COLUMN_FAMILY_NAME) + ":"
            + Bytes.toString(MY_SECOND_COLUMN_QUALIFIER) + "]");
  }

  /**
   * Invokes Table#get and prints out the contents of the retrieved row.
   *
   * @param table Standard Table object
   * @throws IOException If IO problem encountered
   */
  static void getAndPrintRowContents(final Table table) throws IOException {
    Result row = table.get(new Get(MY_ROW_ID));
    System.out.println("Row [" + Bytes.toString(row.getRow())
            + "] was retrieved from Table ["
            + table.getName().getNameAsString()
            + "] in HBase, with the following content:");
    for (Entry<byte[], NavigableMap<byte[], byte[]>> colFamilyEntry
            : row.getNoVersionMap().entrySet()) {
      String columnFamilyName = Bytes.toString(colFamilyEntry.getKey());
      System.out.println(" Columns in Column Family [" + columnFamilyName
              + "]:");
      for (Entry<byte[], byte[]> columnNameAndValueMap
              : colFamilyEntry.getValue().entrySet()) {
        System.out.println(" Value of Column [" + columnFamilyName + ":"
                + Bytes.toString(columnNameAndValueMap.getKey()) + "] == "
                + Bytes.toString(columnNameAndValueMap.getValue()));
      }
    }
  }

  /**
   * Checks to see whether a namespace exists.
   *
   * @param admin Standard Admin object
   * @param namespaceName Name of namespace
   * @return true If namespace exists
   * @throws IOException If IO problem encountered
   */
  static boolean namespaceExists(final Admin admin, final String namespaceName)
          throws IOException {
    try {
      // Throws NamespaceNotFoundException if the namespace is absent.
      admin.getNamespaceDescriptor(namespaceName);
    } catch (NamespaceNotFoundException e) {
      return false;
    }
    return true;
  }

  /**
   * Invokes Table#delete to delete test data (i.e. the row)
   *
   * @param table Standard Table object
   * @throws IOException If IO problem is encountered
   */
  static void deleteRow(final Table table) throws IOException {
    System.out.println("Deleting row [" + Bytes.toString(MY_ROW_ID)
            + "] from Table ["
            + table.getName().getNameAsString() + "].");
    table.delete(new Delete(MY_ROW_ID));
  }

  /**
   * Invokes Admin#disableTable, Admin#deleteTable, and Admin#deleteNamespace to
   * disable/delete Table and delete Namespace.
   *
   * @param admin Standard Admin object
   * @throws IOException If IO problem is encountered
   */
  static void deleteNamespaceAndTable(final Admin admin) throws IOException {
    if (admin.tableExists(MY_TABLE_NAME)) {
      System.out.println("Disabling/deleting Table ["
              + MY_TABLE_NAME.getNameAsString() + "].");
      admin.disableTable(MY_TABLE_NAME); // Disable a table before deleting it.
      admin.deleteTable(MY_TABLE_NAME);
    }
    if (namespaceExists(admin, MY_NAMESPACE_NAME)) {
      System.out.println("Deleting Namespace [" + MY_NAMESPACE_NAME + "].");
      admin.deleteNamespace(MY_NAMESPACE_NAME);
    }
  }
}
2016-04-15 09:43:32,381 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-04-15 09:43:32,974 INFO [main] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x704a52ec connecting to ZooKeeper ensemble=localhost:2181
2016-04-15 09:43:32,984 INFO [main] zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
2016-04-15 09:43:32,985 INFO [main] zookeeper.ZooKeeper: Client
2016-04-15 09:43:32,986 INFO [main] zookeeper.ZooKeeper: Client environment:java.version=1.8.0_45
2016-04-15 09:43:32,986 INFO [main] zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2016-04-15 09:43:32,986 INFO [main] zookeeper.ZooKeeper: Client environment:java.home=/home/dv/jdk1.8.0_45/jre
2016-04-15 09:43:32,986 INFO [main] zookeeper.ZooKeeper: Client environment:java.class.path=/home/dv/NetBeansProjects/column-manager/target/classes:/home/dv/.m2/repository/org/apache/hbase/hbase-common/
2016-04-15 09:43:32,993 INFO [main] zookeeper.ZooKeeper: Client environment:java.library.path=/home/dv/jdk1.8.0_45/jre/lib/amd64:/home/dv/jdk1.8.0_45/jre/lib/i386::/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-04-15 09:43:32,994 INFO [main] zookeeper.ZooKeeper: Client
2016-04-15 09:43:32,994 INFO [main] zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2016-04-15 09:43:32,994 INFO [main] zookeeper.ZooKeeper: Client
2016-04-15 09:43:32,994 INFO [main] zookeeper.ZooKeeper: Client environment:os.arch=amd64
2016-04-15 09:43:32,994 INFO [main] zookeeper.ZooKeeper: Client environment:os.version=3.13.0-85-generic
2016-04-15 09:43:32,994 INFO [main] zookeeper.ZooKeeper: Client
2016-04-15 09:43:32,994 INFO [main] zookeeper.ZooKeeper: Client environment:user.home=/home/dv
2016-04-15 09:43:32,995 INFO [main] zookeeper.ZooKeeper: Client environment:user.dir=/home/dv/NetBeansProjects/column-manager
2016-04-15 09:43:32,996 INFO [main] zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x704a52ec0x0, quorum=localhost:2181, baseZNode=/hbase
2016-04-15 09:43:33,032 INFO [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/ Will not attempt to authenticate using SASL (unknown error)
2016-04-15 09:43:33,049 INFO [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established to localhost/, initiating session
2016-04-15 09:43:33,079 INFO [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session establishment complete on server localhost/, sessionid = 0x154175b89b60008, negotiated timeout = 40000
*** Hello HBase! -- Connection has been established via Zookeeper!!
Creating Table [myTestTable], with one Column Family [cf].
Getting a Table object for [myTestTable] with which to perform CRUD operations in HBase.
Row [rowId01] was put into Table [myTestTable] in HBase;
the row's two columns (created 'on the fly') are: [cf:myFirstColumn] and [cf:mySecondColumn]
Row [rowId01] was retrieved from Table [myTestTable] in HBase, with the following content:
Columns in Column Family [cf]:
Value of Column [cf:myFirstColumn] == Hello
Value of Column [cf:mySecondColumn] == World!
Deleting row [rowId01] from Table [myTestTable].
Disabling/deleting Table [myTestTable].
2016-04-15 09:43:35,002 INFO [main] client.HBaseAdmin: Started disable of myTestTable
2016-04-15 09:43:36,252 INFO [main] client.HBaseAdmin: Disabled myTestTable
2016-04-15 09:43:36,536 INFO [main] client.HBaseAdmin: Deleted myTestTable
Deleting Namespace [myTestNamespace].
2016-04-15 09:43:36,582 INFO [main] client.ConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
2016-04-15 09:43:36,584 INFO [main] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x154175b89b60008
2016-04-15 09:43:36,591 INFO [main] zookeeper.ZooKeeper: Session: 0x154175b89b60008 closed
2016-04-15 09:43:36,591 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down