@abajwa-hw
abajwa-hw / Index documents using HDP Search.md
Last active October 13, 2015 12:13
Index documents using HDPSearch

Lab Overview

In this lab, we will learn to:

  • Configure Solr to store indexes in HDFS
  • Create a Solr cluster of two Solr instances running on ports 8983 and 8984
  • Index documents in HDFS using the Hadoop connectors
  • Use Solr to search documents
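As a rough illustration of the first two goals, the sketch below prints (it does not execute) a start command for each of the two instances with index storage pointed at HDFS. It assumes the `-Dsolr.directoryFactory`, `-Dsolr.lock.type` and `-Dsolr.hdfs.home` system properties from the Solr reference guide; the HDFS URI is a placeholder and `solr_hdfs_start_cmd` is a hypothetical helper, so neither is necessarily what the lab itself uses.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints SolrCloud start commands that keep indexes in HDFS.
# HDFS_HOME is a placeholder assumption; point it at your own NameNode and path.
HDFS_HOME="${HDFS_HOME:-hdfs://namenode.example.com:8020/user/solr}"

# solr_hdfs_start_cmd PORT: print the start command for one Solr instance.
solr_hdfs_start_cmd() {
  printf 'bin/solr start -c -p %s -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.hdfs.home=%s\n' "$1" "$HDFS_HOME"
}

# The two instances named in the lab overview.
for port in 8983 8984; do
  solr_hdfs_start_cmd "$port"
done
```

In a real cluster the second instance would typically also need a ZooKeeper address (the `-z` option) so that it joins the first one.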
abajwa-hw / contoso.md
Last active January 10, 2016 22:48
Import contoso to psql

EDW optimization and Single view lab

EDW optimization

  • Goal: demonstrate how to bulk import data from an EDW/RDBMS into Hive and then keep the Hive tables up to date with periodic incremental imports

Pre-requisites

    1. Download the Contoso data set and MSSQL schema from here into /tmp on the sandbox
abajwa-hw / findjar.sh
Created April 20, 2016 22:26
Find jar
#find jar containing class org.apache.hive.jdbc.HiveStatement
find / -iname '*.jar' | xargs -I {} bash -c "jar -tvf {} | tr / . | grep -F org.apache.hive.jdbc.HiveStatement && echo {}"
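The one-liner above can also be wrapped in a reusable function. This is a minimal sketch, not part of the original script: `class_to_path` and `find_jar_with_class` are hypothetical helper names, and the function matches the class by its entry path inside the jar rather than by translating every listing line with `tr`.

```shell
# class_to_path CLASS: turn a fully qualified class name into its path inside a jar.
class_to_path() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr . /)"
}

# find_jar_with_class DIR CLASS: print each jar under DIR that contains CLASS.
find_jar_with_class() {
  entry="$(class_to_path "$2")"
  find "$1" -iname '*.jar' 2>/dev/null | while IFS= read -r jar; do
    # jar tf lists one entry path per line; -x -F requires an exact literal match.
    if jar tf "$jar" 2>/dev/null | grep -q -x -F "$entry"; then
      printf '%s\n' "$jar"
    fi
  done
}
```

For example, `find_jar_with_class / org.apache.hive.jdbc.HiveStatement` searches the whole filesystem. Reading file names line by line keeps jars with spaces in their paths working.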
abajwa-hw / Sql on Hadoop.md
Last active May 10, 2016 14:55
Sql on Hadoop workshop

LAB

This lab is part of a 'Sql on Hadoop' webinar. The recording and slides can be found here

Purpose

How/when to use Hive vs Phoenix vs SparkSQL

abajwa-hw / HAWQ demo.json
Last active June 8, 2016 02:15
Zeppelin notebook for HAWQ demo
{"paragraphs":[{"title":"Create HAWQ table and generate data series","text":"%psql.sql\ndrop table if exists tt;\ncreate table tt (i int);\ninsert into tt select generate_series(1,1000000);","dateUpdated":"Jun 7, 2016 7:12:59 PM","config":{"colWidth":12,"editorMode":"ace/mode/scala","title":true,"graph":{"mode":"table","height":300,"optionOpen":false,"keys":[],"values":[],"groups":[],"scatter":{}},"enabled":true},"settings":{"params":{},"forms":{}},"jobName":"paragraph_1465351928921_1500086996","id":"20160603-083343_921281900","result":{"code":"SUCCESS","type":"TABLE","msg":"Update Count\n0\n","comment":"","msgTable":[[{"value":"0"}]],"columnNames":[{"name":"Update Count","index":0,"aggr":"sum"}],"rows":[["0"]]},"dateCreated":"Jun 7, 2016 7:12:08 PM","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:863","dateFinished":"Jun 7, 2016 7:13:00 PM","dateStarted":"Jun 7, 2016 7:12:59 PM","focus":true},{"title":"Calculate average of subset of data","text":"%psql.sql\nselect avg(i) from tt where
abajwa-hw / createHDBsandbox.sh
Last active June 8, 2016 15:54
Automated build script to create HAWQ (HDB) single VM sandbox on HDP
###########################################################################################################################
##HDB on HDP sandbox setup script
###Pre-reqs:
#- Laptop with at least 10-12 GB RAM (mine has 16 GB)
#- ISO image of Centos 6.7 or later downloaded from [here](http://isoredirect.centos.org/centos/6/isos/x86_64/).
# - In my case, I used CentOS-6.7-x86_64-bin-DVD1.iso.
##### Setup Centos 6.7 or later on VM
#- Start a CentOS VM using ISO
abajwa-hw / HDF workshop steps.md
Last active June 30, 2016 18:12
HDF workshop steps

HDF workshop

Setup already done

  • Downloaded the latest HDP 2.3 sandbox VM from http://hortonworks.com/sandbox and imported it into VMware Fusion
  • Deployed and installed the VNC Ambari service (to be able to 'remote desktop' in and use Eclipse)
    sudo git clone https://github.com/hortonworks-gallery/ambari-vnc-service.git /var/lib/ambari-server/resources/stacks/HDP/2.3/services/VNCSERVER
    
    
abajwa-hw / setup_nifi_ssl.md
Created July 15, 2016 00:40
Setup SSL for clustered Nifi
  • Create and distribute certs on HDF cluster
#download toolkit
wget https://hipchat.hortonworks.com/files/1/2055/bT1LbKB8SS26X9t/nifi-toolkit-1.0.0-SNAPSHOT-bin.zip

#create nifi certs dir under ambari-server resources dir
mkdir /var/lib/ambari-server/resources/host_scripts/nifi-certs

#generate certs using toolkit into ambari-server resources dir
abajwa-hw / HDB-install.md
Last active July 20, 2016 18:36
Install HDB

Add HDB (HAWQ) to HDP 2.4.2 with Zeppelin

Goals:

  • Install a 4-node cluster running HDP 2.4.2 with Ambari 2.2.2.0 (including Zeppelin and HDB), using either Ambari bootstrap via blueprints or the Ambari install wizard
  • Configure HAWQ for Zeppelin
  • Configure Zeppelin for HAWQ
  • Run HAWQ queries via Zeppelin

Notes:

abajwa-hw / manual_nifi_rpm_install.sh
Last active August 13, 2016 02:15
Manual rpm install for Nifi
build=447
tee /etc/yum.repos.d/HDF.repo > /dev/null << EOF
[HDF-2.0]
name=HDF-2.0
baseurl=http://s3.amazonaws.com/dev.hortonworks.com/HDF/centos6/2.x/BUILDS/2.0.0.0-$build
gpgcheck=0
path=/
enabled=1
EOF
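To check what the heredoc expands to before anything is written under /etc, the same template can be rendered to stdout first. This is a sketch only: `render_hdf_repo` is a hypothetical helper, and the repo contents are copied from the script above with the build number passed in as an argument.

```shell
# render_hdf_repo BUILD: print the HDF repo definition for the given build number.
render_hdf_repo() {
  cat << EOF
[HDF-2.0]
name=HDF-2.0
baseurl=http://s3.amazonaws.com/dev.hortonworks.com/HDF/centos6/2.x/BUILDS/2.0.0.0-$1
gpgcheck=0
path=/
enabled=1
EOF
}

# Preview the repo file for the build used above.
render_hdf_repo 447
```

Once the output looks right, it can be piped into `tee /etc/yum.repos.d/HDF.repo` as in the original script.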