Skip to content

Instantly share code, notes, and snippets.

@vrivellino
vrivellino / ProcessKinesisRecords.js
Last active December 30, 2017 15:35
Lambda: archive kinesis stream to S3
View ProcessKinesisRecords.js
console.log('Loading function');
var AWS = require('aws-sdk'),
s3 = new AWS.S3(),
s3Bucket = 'archive-bucket',
s3Prefix = 'kinesis-archive-test',
s3Partitions = 2;
exports.handler = function (event, context) {
//console.log(JSON.stringify(event, null, 2));
@epiphani
epiphani / CDHTez.md
Last active July 5, 2018 14:13
Getting Tez enabled on CDH5.4+
View CDHTez.md

So Hive in CDH is horribly, painfully slow. Cloudera ships Hive 1.1, which is actually moderately modern. It is, however, very badly configured out of the box and patched with custom code from Cloudera. With a bit of effort, we managed to improve hive performance considerably. We really shouldn't have to do this, but Cloudera is actively working against supporting a performant Hive.

First, building Tez was fairly straightforward. Using the instructions at https://github.com/apache/tez/blob/master/docs/src/site/markdown/install.md, the only change was to use the version string "2.6.0" for the build. I believe that was the default. Don't use the CDH string, it won't work.

At the bottom of the installation instructions, there's mention of the fact that to use the local hadoop jars (rather than those packaged with tez) you must unpack the jars in HDFS rather than using the tarball. In this case, unpack the tez-minimal tarball and upload the contents to /apps/tez-0.7.0 (or whatever you prefer). Don't fo

@eridal
eridal / dp2dot.js
Created July 28, 2017 22:24
Build a dot model from a aws data pipeline json definition
View dp2dot.js
function merge (into, object) {
Object
.keys(object)
.forEach(k => {
let val = object[k]
if (val){
if (Array.isArray(val)) {
into[k] = [].concat(into[k] || [], val)
}
else if (typeof val === 'object') {
@9b
9b / poorMansConvert.py
Created May 19, 2013 20:01
Uses the Google Drive API to upload a file, convert it to a file format, download it locally and delete it from Drive.
View poorMansConvert.py
#!/usr/bin/python
def poorMansConvert(di, inPath, outType, outPath):
from apiclient.http import MediaFileUpload
valid_output = [
'text/html','text/plain','application/rtf','application/vnd.oasis.opendocument.text',\
'application/pdf','application/vnd.openxmlformats-officedocument.wordprocessingml.document',\
'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet','application/x-vnd.oasis.opendocument.spreadsheet',\
'image/jpeg','image/png','image/svg+xml','application/vnd.openxmlformats-officedocument.presentationml.presentation'
@ambakshi
ambakshi / iam-assume-role.sh
Last active October 25, 2021 15:50
Assume an IAM role. An interesting way of doing IAM roles is to give the instance permissions to assume another role, but no actual permissions by default. I got this idea while setting up security monkey: http://securitymonkey.readthedocs.org/en/latest/quickstart1.html#setup-iam-roles.
View iam-assume-role.sh
#!/bin/bash
#
# Assume the given role, and print out a set of environment variables
# for use with aws cli.
#
# To use:
#
# $ eval $(./iam-assume-role.sh)
#
@linar-jether
linar-jether / simple_python_datasource.py
Last active May 22, 2022 12:26
Grafana python datasource - using pandas for timeseries and table data. inspired by and compatible with the simple json datasource ---- Up-to-date version maintained @ https://github.com/panodata/grafana-pandas-datasource
View simple_python_datasource.py
from flask import Flask, request, jsonify, json, abort
from flask_cors import CORS, cross_origin
import pandas as pd
app = Flask(__name__)
cors = CORS(app)
app.config['CORS_HEADERS'] = 'Content-Type'
@provegard
provegard / ssdp-test.py
Created December 5, 2011 21:52
Small SSDP server/client test in Python
View ssdp-test.py
#!/usr/bin/python
# Python program that can send out M-SEARCH messages using SSDP (in server
# mode), or listen for SSDP messages (in client mode).
import sys
from twisted.internet import reactor, task
from twisted.internet.protocol import DatagramProtocol
SSDP_ADDR = '239.255.255.250'
@alfredkrohmer
alfredkrohmer / xbox-one-wireless-protocol.md
Created November 23, 2016 21:52
XBox One Wireless Controller Protocol
View xbox-one-wireless-protocol.md

Physical layer

The dongle itself is sending out data using 802.11a (5 GHz WiFi) with OFDM and 6 Mbit/s data rate:

Radiotap Header v0, Length 38
    Header revision: 0
    Header pad: 0
    Header length: 38
    Present flags