Bryce Nyeggen bnyeggen
bnyeggen / clojure_hive_thrift.clj
Created December 13, 2011 15:28
Clojure to Hive via Thrift
(comment You will want just about everything in your hive/lib dir included in your Classpath)
(ns myproj.core
  (:import [org.apache.hadoop.hive.service HiveClient]
           [org.apache.thrift.transport TSocket]
           [org.apache.thrift.protocol TBinaryProtocol]))
(defn send-hive
  "Creates a new socket and Hive client connection, runs the query, pulls the result, and closes the connection.
  Eventually modify to split and parse according to schema of result.
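The docstring mentions eventually splitting and parsing the result according to its schema. A minimal sketch of that step in Python, assuming the client hands back each row as a single tab-delimited string (an assumption, since the preview cuts off before the result handling). The function `parse_rows` and the `(name, converter)` schema shape are hypothetical and not part of the gist; NULL handling is not covered.

```python
def parse_rows(rows, schema, delimiter="\t"):
    """Split delimited row strings and convert each field by position.

    rows:   iterable of strings, one per result row (assumed tab-delimited)
    schema: list of (column_name, converter) pairs, e.g. ("id", int)
    """
    parsed = []
    for row in rows:
        fields = row.split(delimiter)
        if len(fields) != len(schema):
            raise ValueError(
                "expected %d fields, got %d: %r" % (len(schema), len(fields), row)
            )
        # Pair each raw field with its column name and converter.
        parsed.append(
            {name: convert(value) for (name, convert), value in zip(schema, fields)}
        )
    return parsed


# Hypothetical schema and rows, for illustration only.
example_schema = [("id", int), ("name", str), ("score", float)]
example_rows = ["1\talice\t3.5", "2\tbob\t4.0"]
records = parse_rows(example_rows, example_schema)
```

Keeping the schema as plain data means the same function can serve any query whose column types are known ahead of time.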
bnyeggen / clojure_hive_jdbc.clj
Created December 13, 2011 15:12
Clojure to Hive via JDBC
(comment Add [org.clojure/java.jdbc "0.1.1"] to project dependencies)
(ns myproject.core
  (:use [clojure.java.jdbc :only [with-connection, with-query-results]]))
(let [db-host "MyHost"
      db-port 10000
      db-name "default"]
  (def db {:classname "org.apache.hadoop.hive.jdbc.HiveDriver" ; must be in classpath
           :subname (str "//" db-host ":" db-port "/" db-name)
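The `:subname` entry is just the host, port, and database name joined into one string. A small Python sketch of the same concatenation, using the values from the preview; the helper name `hive_subname` is hypothetical, and the rest of the connection map is not shown in the preview, so it is not reproduced here.

```python
def hive_subname(host, port, database):
    """Build the '//host:port/database' string, mirroring the (str ...) call."""
    return "//{}:{}/{}".format(host, port, database)


subname = hive_subname("MyHost", 10000, "default")
```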
bnyeggen / multiprocess_with_instance_methods.py
Created July 16, 2011 14:17
Example showing how to use instance methods with the multiprocessing module
from multiprocessing import Pool
from functools import partial

def _pickle_method(method):
    func_name = method.im_func.__name__
    obj = method.im_self
    cls = method.im_class
    if func_name.startswith('__') and not func_name.endswith('__'): #deal with mangled names
        cls_name = cls.__name__.lstrip('_')
        func_name = '_' + cls_name + func_name
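The preview relies on `im_func`, `im_self`, and `im_class`, which are Python 2 method attributes. The name-mangling step itself can be shown in isolation. This is a minimal sketch that also runs on Python 3; the helper `mangled_name` and the class `_Worker` are hypothetical names used only for illustration.

```python
def mangled_name(cls_name, func_name):
    """Return the attribute name under which a method is stored on its class.

    Names with two leading underscores and no trailing double underscore are
    stored as _ClassName__name, with leading underscores of the class name
    stripped first.
    """
    if func_name.startswith('__') and not func_name.endswith('__'):
        return '_' + cls_name.lstrip('_') + func_name
    return func_name


class _Worker(object):
    def __double(self, x):  # stored on the class as _Worker__double
        return 2 * x

    def triple(self, x):  # ordinary name, stored unchanged
        return 3 * x


worker = _Worker()
# Look the methods up by their stored names, as an unpickler would need to.
private = getattr(worker, mangled_name('_Worker', '__double'))
public = getattr(worker, mangled_name('_Worker', 'triple'))
```

Looking a method up by its plain `__name__` would fail for `__double`, which is why the pickling helper has to rebuild the mangled name before storing it.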
bnyeggen / raid_mtbf.py
Created July 11, 2011 22:54
A RAID MTBF calculator
#redundancy is the max number of survivable failures, so eg 1 for RAID5
#mtbf_array is an array of either actual mean-time-between-failures, or a nested RAID array
# RAID([100]*7,2) #7 disk RAID 6
# RAID([RAID([100]*3,1),RAID([1000]*3,1)],0) # RAID 50, 2 arrays of 3
# RAID([100,100,50,50],1) #RAID 5 with varying reliabilities
from random import random
class RAID(object):
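The preview ends before the class body, so the author's actual method is not visible. The following is only a sketch of one way such a calculator could work, under stated assumptions: failure times are exponentially distributed with the given mean, failed disks are never replaced, and an array dies on failure number `redundancy + 1`. The names `sample_failure_time` and `estimate_mtbf`, and the `(components, redundancy)` tuple encoding of a nested array, are hypothetical.

```python
import random


def sample_failure_time(node, rng):
    """Draw one time-to-failure for a disk (a number) or a nested array.

    A nested array is written as (components, redundancy), where each
    component is itself a disk MTBF or another nested array.
    """
    if isinstance(node, (int, float)):
        # Assumption: a disk with MTBF m fails after an exponential time, mean m.
        return rng.expovariate(1.0 / node)
    components, redundancy = node
    if redundancy >= len(components):
        raise ValueError("redundancy must be smaller than the number of components")
    times = sorted(sample_failure_time(c, rng) for c in components)
    # The array survives `redundancy` failures and is lost on the next one.
    return times[redundancy]


def estimate_mtbf(node, trials=20000, seed=0):
    """Monte Carlo estimate of the mean time to failure of `node`."""
    rng = random.Random(seed)
    total = sum(sample_failure_time(node, rng) for _ in range(trials))
    return total / trials


# The three examples from the comments, in the tuple encoding assumed here.
raid6 = ([100] * 7, 2)
raid50 = ([([100] * 3, 1), ([1000] * 3, 1)], 0)
raid5_mixed = ([100, 100, 50, 50], 1)
estimates = [estimate_mtbf(a, trials=2000) for a in (raid6, raid50, raid5_mixed)]
```

Because the estimate is a sample mean, it varies with `trials` and `seed`; a model with disk replacement would need a different simulation.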