@abajwa-hw
abajwa-hw / hbase-indexing-solr.md
Last active December 1, 2020 18:09
Hbase indexing to solr in HDP 2.3

Hbase indexing to solr in HDP 2.3

  • Background:

The HBase Indexer streams events from HBase to Solr for near-real-time searching. It is included with HDPSearch as an additional service. The indexer works by acting as an HBase replication sink: as updates are written to HBase, the events are asynchronously replicated to the HBase Indexer processes, which in turn create Solr documents and push them to Solr.

abajwa-hw / Index documents using HDP Search.md
Last active October 13, 2015 12:13
Index documents using HDPSearch

Index documents using HDPSearch

Lab Overview

In this lab, we will learn to:

  • Configure Solr to store indexes in HDFS
  • Create a Solr cluster of two Solr instances running on ports 8983 and 8984
  • Index documents in HDFS using the Hadoop connectors
  • Use Solr to search documents
abajwa-hw / Sql on Hadoop.md
Last active May 10, 2016 14:55
Sql on Hadoop workshop

LAB

This lab is part of a 'SQL on Hadoop' webinar. The recording and slides can be found here

Purpose

How/when to use Hive vs Phoenix vs SparkSQL

abajwa-hw / HDF workshop steps.md
Last active June 30, 2016 18:12
HDF workshop steps

HDF workshop

Setup already done

  • Downloaded the latest HDP 2.3 sandbox VM from http://hortonworks.com/sandbox and imported it into VMware Fusion
  • Deployed and installed the VNC Ambari service (to be able to 'remote desktop' in and use Eclipse)
    sudo git clone https://github.com/hortonworks-gallery/ambari-vnc-service.git /var/lib/ambari-server/resources/stacks/HDP/2.3/services/VNCSERVER
    
    
abajwa-hw / settings.xml
Last active February 7, 2020 13:07
~/.m2/settings.xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
http://maven.apache.org/xsd/settings-1.0.0.xsd">
<localRepository/>
<interactiveMode/>
<usePluginRegistry/>
<offline/>
<pluginGroups/>
<servers/>
abajwa-hw / Build LLAP on HDP 2.3.md
Last active November 9, 2017 07:59
Build LLAP on HDP 2.3

Build LLAP on HDP 2.3.x

  • Below are sample steps for building LLAP on an HDP 2.3.x cluster using Gopal's tez-autobuild repo.
  • Note that, because of HIVE-12446, a slightly older version of Tez (tag 0.8.1-alpha) is built for now.

Steps

  • SSH into HDP cluster
  • Install C/C++ compiler
yum install -y gcc gcc-cpp gcc-c++
abajwa-hw / contoso.md
Last active January 10, 2016 22:48
Import contoso to psql

EDW optimization and Single view lab

EDW optimization

  • Goal: demonstrate how to bulk import data from an EDW/RDBMS into Hive and then keep the Hive tables current with periodic incremental updates

Pre-requisites

    1. Download the Contoso data set and MSSQL schema from here into /tmp on the sandbox
abajwa-hw / findjar.sh
Created April 20, 2016 22:26
Find jar
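# Find the jar containing class org.apache.hive.jdbc.HiveStatement
# -print0/-0 keep paths with spaces intact, -I replaces the deprecated -i, and grep -F matches the dots literally
find / -iname '*.jar' -print0 | xargs -0 -I {} bash -c 'jar -tvf "$1" | tr / . | grep -F org.apache.hive.jdbc.HiveStatement && echo "$1"' _ {}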
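The one-liner above can also be wrapped in a reusable function so that the class name and search root become arguments. The following is only a minimal sketch, not part of the original gist: the helper names `class_to_entry` and `find_jar_with_class` are hypothetical, and it assumes `bash` and the JDK `jar` tool are on the PATH.

```shell
#!/usr/bin/env bash

# Hypothetical helper: turn a fully qualified class name into its jar entry path,
# e.g. a.b.C becomes a/b/C.class
class_to_entry() {
  printf '%s.class\n' "${1//.//}"
}

# Hypothetical helper: print every jar under ROOT (default: current directory)
# whose listing contains CLASS as an exact entry.
# Usage: find_jar_with_class CLASS [ROOT]
find_jar_with_class() {
  local entry root jar_file
  entry="$(class_to_entry "$1")"
  root="${2:-.}"
  # -print0 / read -d '' keep paths with spaces or newlines intact
  find "$root" -type f -iname '*.jar' -print0 2>/dev/null |
    while IFS= read -r -d '' jar_file; do
      # grep -x -F: match the whole line, treating the entry as a literal string
      if jar -tf "$jar_file" 2>/dev/null | grep -qxF "$entry"; then
        printf '%s\n' "$jar_file"
      fi
    done
}
```

For example, `find_jar_with_class org.apache.hive.jdbc.HiveStatement /usr/hdp` would limit the search to one directory tree (the path here is just an illustration) instead of scanning the whole filesystem. Matching the exact entry path avoids false positives from classes whose names merely contain the search string.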
abajwa-hw / HDB-install.md
Last active July 20, 2016 18:36
Install HDB

Add HDB (HAWQ) to HDP 2.4.2 with Zeppelin

Goals:

  • Install a 4-node cluster running HDP 2.4.2 with Ambari 2.2.2.0 (including Zeppelin and HDB), using either Ambari bootstrap with blueprints or the Ambari install wizard
  • Configure HAWQ for Zeppelin
  • Configure Zeppelin for HAWQ
  • Run HAWQ queries via Zeppelin

Notes: