Owen O'Malley omalley

@omalley
omalley / github schema
Last active September 26, 2018 08:16
Auto-discovered Apache Hive schema for githubarchive.org's JSON logs
create table github (
actor: string,
actor_attributes: struct <
blog: string,
company: string,
email: string,
gravatar_id: binary,
location: string,
login: string,
name: string,
omalley / tweet-by-user.py
Created March 3, 2013 06:28
A Python script that uses the Twitter API to find the most prolific tweeters in your feed.
#!/usr/bin/python
import base64
import hmac
import hashlib
import httplib
import json
import random
import string
import sys
omalley / Sort.java
Created February 8, 2012 16:28
A patch that makes Hadoop's sort example support textual sorts
diff --git src/examples/org/apache/hadoop/examples/Sort.java src/examples/org/apache/hadoop/examples/Sort.java
index a028009..40c7647 100644
--- src/examples/org/apache/hadoop/examples/Sort.java
+++ src/examples/org/apache/hadoop/examples/Sort.java
@@ -20,13 +20,18 @@ package org.apache.hadoop.examples;
import java.io.IOException;
import java.net.URI;
-import java.util.*;
+import java.util.ArrayList;