Christopher Matta cjmatta
@cjmatta
cjmatta / ec2cluster.md
Last active August 29, 2015 14:03
clustershell groups.conf.d configuration to parse a cluster.hosts file

In the /etc/clustershell/groups.conf file I like to put the following definitions:

[mygroup]
map: sed -n '1,/$GROUP/d;/\[/,$d;/^$/d;p' /etc/clustershell/groups.conf.d/mygroup_poc_cluster.hosts | awk '{print $1}'
all: grep -v "^\[" /etc/clustershell/groups.conf.d/mygroup_poc_cluster.hosts | grep -v ^$
list: grep "^\[" /etc/clustershell/groups.conf.d/mygroup_poc_cluster.hosts | sed -e "s/\[//" -e "s/\]//"
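As a rough sketch of what the `map` and `list` commands return, the pipelines can be run by hand against a small inventory. The file contents, the `ansible_ssh_user=root` column, and the helper names `group_members` and `list_groups` below are hypothetical and exist only for illustration; the group name is passed as an argument in place of the `$GROUP` placeholder.

```shell
#!/bin/bash
# Hypothetical inventory standing in for mygroup_poc_cluster.hosts.
hosts_file="$(mktemp)"
trap 'rm -f "$hosts_file"' EXIT
cat > "$hosts_file" <<'EOF'
[cldb]
node1 ansible_ssh_user=root
node2 ansible_ssh_user=root

[zookeeper]
node1 ansible_ssh_user=root
node2 ansible_ssh_user=root
node3 ansible_ssh_user=root

[webserver]
node4 ansible_ssh_user=root
EOF

# Same pipeline as the "map" line: drop everything up to and including the
# group header, drop everything from the next "[" header onward, drop blank
# lines, then keep only the first column (the hostname).
group_members() {
  sed -n '1,/'"$1"'/d;/\[/,$d;/^$/d;p' "$2" | awk '{print $1}'
}

# Same pipeline as the "list" line: keep header lines and strip the brackets.
list_groups() {
  grep "^\[" "$1" | sed -e "s/\[//" -e "s/\]//"
}

list_groups "$hosts_file"              # prints cldb, zookeeper, webserver
group_members zookeeper "$hosts_file"  # prints node1, node2, node3
```

One thing to watch with GNU sed: in a `1,/pattern/` range the end pattern is only checked from line 2 onward, so a group whose header sits on the very first line of the file may not resolve as expected.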

The cluster.hosts file is generated from the mapr-ansible-roles and looks like this:

cjmatta / pop_percent_by_state.json
Last active August 29, 2015 14:06
Percentage of population by state - 2009 Census
[
"AL": 0.015802252,
"AK": 0.00222773,
"AZ": 0.018231104,
"AR": 0.009499616,
"CA": 0.12035896,
"CO": 0.015284031,
"CT": 0.012101279,
"DE": 0.002784431,
"DC": 0.002032745,
cjmatta / percentage_pop_by_zip.json
Created September 15, 2014 23:28
Percentage population by zip code
[
"01001": 5.43143e-05,
"01002": 9.40889e-05,
"01003": 3.35946e-05,
"01005": 1.64507e-05,
"01007": 4.74477e-05,
"01008": 4.09082e-06,
"01009": 2.40008e-06,
"01010": 1.16895e-05,
"01011": 4.43739e-06,
cjmatta / Othello.txt
Created September 26, 2014 18:24
Othello
ACT I
SCENE I. Venice. A street.
Enter RODERIGO and IAGO
RODERIGO
Tush! never tell me; I take it much unkindly
That thou, Iago, who hast had my purse
As if the strings were thine, shouldst know of this.
IAGO
cjmatta / TDCH Notes.md
Last active May 25, 2024 04:53
Teradata Notes

## Teradata TDCH and Parallel Transporter with MapR

## Environment

Running on Peep.local: 192.168.1.26

### Tasks

  • Load sample data into Teradata database
  • Run queries to ensure it's in there
  • Move data using TDCH
  • Move data using TD Parallel transporter
cjmatta / Drill Demo
Created October 5, 2014 23:58
Title
## test customer view
```SQL
select
cast(row_key as int) as row_key,
cast(`address`['state'] as VARCHAR(255)) as state,
cast(`loyalty`['agg_rev'] as VARCHAR(255)) as agg_rev,
cast(`loyalty`['membership'] as VARCHAR(255)) as membership,
cjmatta / drill_twitter.md
Last active April 12, 2017 15:48
Examining Tweets with Drill

Tweets

SELECT 
CAST(`t`.`dir0` AS VARCHAR(255)) AS `topic`,
CAST(`t`.`dir1` AS INTEGER) AS `year`,
CAST(`t`.`dir2` AS INTEGER) AS `month`,
CAST(`t`.`dir3` AS INTEGER) AS `day`,
CAST(`t`.`dir4` AS INTEGER) AS `hour`,
CAST(`t`.`id` AS BIGINT) AS `id`,
CAST(`t`.`user`['id'] AS BIGINT) AS `user_id`,
cjmatta / checkcomp.sh
Created October 25, 2014 00:55
MapR checkcomp
#!/bin/bash
# checkcomp - A script to show the relative compressed and uncompressed file sizes on a MapR filesystem.
# Chris Matta
# cmatta@mapr.com
#
# Currently broken when a directory has multiple dir children. Need to write a directory walk function.
set -o nounset
set -o errexit
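# The directory walk mentioned in the header comment could be sketched as
# below. This is not the author's implementation: walk_dirs is a hypothetical
# helper, and it assumes the filesystem is reachable as an ordinary path
# (for example through a mount), which the original script does not state.

```shell
#!/bin/bash
set -o nounset
set -o errexit

# walk_dirs DIR: print DIR and every directory beneath it, one per line,
# depth-first. Symlinked directories are skipped to avoid loops, and hidden
# directories are skipped because the glob does not match dotfiles.
walk_dirs() {
  local dir="$1" entry
  printf '%s\n' "$dir"
  for entry in "$dir"/*/; do
    # An unmatched glob stays literal and fails the -d test, so it is skipped.
    [[ -d "$entry" && ! -L "${entry%/}" ]] || continue
    walk_dirs "${entry%/}"
  done
}
```

# Calling walk_dirs on a top-level directory yields each child directory in
# turn, so a per-directory size check could be run once per printed line
# instead of assuming a single directory child.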
cjmatta / configureOozie_mapr.sh
Last active August 29, 2015 14:08
configureOozie_mapr.sh
#!/bin/bash
# Copyright (C) 2013 by Teradata Corporation.
# All Rights Reserved.
#
# This script installs tdch for Oozie transfers with Hadoop
#
# Version : $Id$
# MapR Notes
# Since MapR doesn't need a nameNode, we've removed it from this script
cjmatta / run_drill_query.sh
Created November 19, 2014 02:54
A bash script for running a query against a drill cluster and then collecting the updated logs from the whole cluster.
#!/bin/bash
set -o nounset
set -o errexit
# ${1:-} keeps the friendly error message when no argument is given under nounset.
if [[ ! -f "${1:-}" ]]; then
  echo "Query file ${1:-} not found, exiting." >&2
  exit 1
fi