Owen O'Malley omalley

@omalley
omalley / Sort.java
Created February 8, 2012 16:28
A patch that makes Hadoop's sort example support textual sorts
diff --git src/examples/org/apache/hadoop/examples/Sort.java src/examples/org/apache/hadoop/examples/Sort.java
index a028009..40c7647 100644
--- src/examples/org/apache/hadoop/examples/Sort.java
+++ src/examples/org/apache/hadoop/examples/Sort.java
@@ -20,13 +20,18 @@ package org.apache.hadoop.examples;
import java.io.IOException;
import java.net.URI;
-import java.util.*;
+import java.util.ArrayList;
omalley / tweet-by-user.py
Created March 3, 2013 06:28
A Python script that uses the Twitter API to find the most prolific tweeters in your feed.
#!/usr/bin/python
import base64
import hmac
import hashlib
import httplib
import json
import random
import string
import sys
omalley / github schema
Last active September 26, 2018 08:16
Auto-discovered Apache Hive schema for githubarchive.org's JSON logs
create table github (
actor: string,
actor_attributes: struct <
blog: string,
company: string,
email: string,
gravatar_id: binary,
location: string,
login: string,
name: string,
omalley / sqrt.cc
Last active December 16, 2015 17:40
Computes the square roots of integers to arbitrary precision.
#include <math.h>
#include <iostream>
#include <string>
#include <sstream>
// Written by Owen O'Malley (omalley@apache.org)
// compile with: g++ -O sqrt.cc -lcln -lm
#define WANT_OBFUSCATING_OPERATORS
#include <cln/integer.h>
omalley / mac2038.c
Created April 7, 2015 19:11
The following program demonstrates a 2038 bug in Mac OS 10.10.2.
#include <stdio.h>
#include <time.h>
/*
Demonstrates the Mac OS bug for 2038. The output on Mac OS/X 10.10.2 in PDT:
In 2036 2093587200 - 2093562000 = 25200
In 2037 2125126800 - 2125101600 = 25200
In 2038 2156666400 - 2156637600 = 28800
In 2039 2188202400 - 2188173600 = 28800
Verifying that +omalley is my blockchain ID. https://onename.com/omalley
omalley / OrcWriter.java
Created November 24, 2015 19:00
A sample ORC writer using a dynamic schema
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
omalley / OrcWriter2.java
Created November 24, 2015 19:30
An example ORC writer using a dynamic schema in Hive 2.0
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.CompressionKind;
import org.apache.orc.TypeDescription;
import org.apache.orc.OrcFile;
import org.apache.orc.Writer;
package org.apache.orc.examples;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.OrcFile;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0