
Umberto Griffo umbertogriffo

umbertogriffo / HBaseRestore.rb
Created February 19, 2016 15:04
This code restores the snapshots of all HBase tables saved by the script HBaseBackup.rb (https://gist.github.com/umbertogriffo/fe1bce24f8e9ee68c75f). Tested on CDH-5.4.4-1.
# To run the script, launch this command from the shell: hbase shell HBaseRestore.rb
include Java
java_import org.apache.hadoop.hbase.HBaseConfiguration
java_import org.apache.hadoop.hbase.client.HBaseAdmin
java_import org.apache.hadoop.hbase.snapshot.ExportSnapshot
java_import org.apache.hadoop.hbase.TableExistsException
java_import org.apache.hadoop.util.ToolRunner
umbertogriffo / HBaseBackup.rb
Last active March 24, 2023 15:01
This code takes a snapshot of all HBase tables using the snapshot command (no file copies are performed). Tested on CDH-5.4.4-1.
# Before running, check that the hbase.snapshot.enabled property in hbase-site.xml is set to true
# To run the script, launch this command from the shell: hbase shell HBaseBackup.rb
@clusterToSave = "hdfs://srv2:8082/hbase"
# CHECK THE PATH OF HBase lib
@libjars = `ls /opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hbase/*.jar | tr "\n" ","`
@ignore = [ /zipkin\..*/i, /.*_temp/i, /.*tmp/i, /test_.*/i, /.*_test/i, /.*_old/i ]
@mappers = "2"
include Java
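
The preview above is cut off before `@ignore` is used, so how the script applies it is not shown. A minimal sketch, assuming the patterns are meant to skip matching table names before snapshotting, could look like the following. The helper names `ignored?` and `tables_to_snapshot` and the sample table names are hypothetical and not taken from the gist.

```ruby
# Patterns copied from the @ignore list in HBaseBackup.rb.
IGNORE = [ /zipkin\..*/i, /.*_temp/i, /.*tmp/i, /test_.*/i, /.*_test/i, /.*_old/i ]

# Hypothetical helper (not from the gist): true when any pattern matches the name.
def ignored?(table_name, patterns = IGNORE)
  patterns.any? { |pattern| pattern =~ table_name }
end

# Hypothetical helper: keep only the table names that no pattern matches.
def tables_to_snapshot(table_names, patterns = IGNORE)
  table_names.reject { |name| ignored?(name, patterns) }
end

# Made-up table names, for illustration only.
sample = ["users", "zipkin.traces", "orders_temp", "test_events", "events_old", "metrics"]
puts tables_to_snapshot(sample).inspect
```

Note that the patterns are not anchored with `\A` and `\z`, so they match anywhere in a name: a table called `latest_data` would also be skipped, because it contains `test_`.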
umbertogriffo / TwitterSentimentAnalysisAndN-gramWithHadoopAndHiveSQL.md
Last active May 11, 2021 13:22
Step-by-step tutorial on Twitter sentiment analysis and n-grams with Hadoop and Hive SQL

PREREQUISITES

* Download the JSON SerDe from http://files.cloudera.com/samples/hive-serdes-1.0-SNAPSHOT.jar and rename it to hive-serdes-1.0.jar.
* Add the JAR to the HIVE_AUX_JARS_PATH of HiveServer2:

    1. Copy the JAR files to the host on which HiveServer2 is running. Save the JARs to any directory you choose, and make a note of the path (create directory in /usr/share/).