@cbaenziger
cbaenziger / hbase_backup.md
Last active August 1, 2016 14:37 — forked from mlongob/hbase_backup.md
HBase backup solutions

Introduction

This is a proposed procedure for backing up HBase tables in a secure HBase cluster. Requirements:

  • Live backups (cannot disable the table or take HBase offline)
  • Self-service (non-HBase users can back up and restore their own data)
  • Automatable procedure (Oozie controlled)
  • Works on a secure cluster (a cluster whose /hbase folder is not world-readable)
  • Supports off-cluster backups
      • The backup location might not have an installed instance of HBase, just HDFS
      • The backup location does not have credentials for the hbase user

Chef-bach can be used to create a Hadoop test cluster from virtual machines on a hypervisor host with enough resources. The resulting cluster has four nodes: one bootstrap node, which hosts a Chef server, and three Hadoop nodes, of which two are master nodes and one is a worker node. The steps below create the test cluster. They have been tested on hypervisor hosts running Mac OS and Ubuntu.

  • Install curl on the hypervisor host
  • Install VirtualBox on the hypervisor host
  • Install Vagrant on the hypervisor host
  • Delete the default DHCP server built into VirtualBox
  • Run sudo pkill -f VBox on the hypervisor host
  • Clone the chef-bach repository onto the hypervisor host: git clone https://github.com/bloomberg/chef-bach.git
  • Rename the chef-bach directory to chef-bcpc on the hypervisor host
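
The command-line steps listed so far could be collected into a small script. The following is only a sketch: the `run` and `prepare_host` helpers and the `DRY_RUN` variable are hypothetical names introduced here, not part of chef-bach. It prints the commands by default instead of running them. It leaves out package installation and the DHCP server removal, since those depend on the host OS and on the local VirtualBox network names.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, run and prepare_host are illustrative names,
# not part of chef-bach. With DRY_RUN=1 (the default) each command is
# printed rather than executed; set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_host() {
    # Report (on stderr) any prerequisite tool that is not on the PATH.
    for tool in curl VBoxManage vagrant git; do
        command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool" >&2
    done

    # The default VirtualBox DHCP server is not removed here; the
    # network name varies per host, so that step stays manual.

    # Stop any running VirtualBox processes.
    run sudo pkill -f VBox
    # Clone the chef-bach repository.
    run git clone https://github.com/bloomberg/chef-bach.git
    # Rename the checkout to chef-bcpc.
    run mv chef-bach chef-bcpc
}

prepare_host
```

Running the script as-is only lists what would be done. Review that output first, then rerun with `DRY_RUN=0` from the directory that should contain the chef-bcpc checkout.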