Clément Stenac cstenac

cstenac / ConvertToCelcius.java
Last active December 13, 2015 22:28
Simple Hive UDF (1)
import org.apache.hadoop.hive.ql.exec.UDF;

/** A simple UDF to convert Fahrenheit to Celsius */
public class ConvertToCelcius extends UDF {
    public double evaluate(double value) {
        return (value - 32) / 1.8;
    }
}
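The arithmetic in `evaluate` subtracts 32 and divides by 1.8, which is the Fahrenheit-to-Celsius direction. A minimal standalone sketch for checking that formula outside Hive follows; the class name `FahrenheitToCelsius` and the method `toCelsius` are hypothetical and are not part of the gist.

```java
/** Standalone sketch with no Hive dependency: same arithmetic as evaluate(). */
final class FahrenheitToCelsius {
    // Hypothetical helper name; mirrors (value - 32) / 1.8 from the UDF.
    static double toCelsius(double fahrenheit) {
        return (fahrenheit - 32) / 1.8;
    }

    public static void main(String[] args) {
        // Print two familiar reference points: water freezing and boiling, in Fahrenheit.
        System.out.println(toCelsius(32.0));
        System.out.println(toCelsius(212.0));
    }
}
```

Because 1.8 has no exact binary representation, comparing results with a small tolerance is safer than comparing with `==`.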
cstenac / AbsValue.java
Last active December 13, 2015 22:28
Simple Hive UDF (2)
import org.apache.hadoop.hive.ql.exec.UDF;

/** A simple UDF to get the absolute value of a number */
public class AbsValue extends UDF {
    public double evaluate(double value) {
        return Math.abs(value);
    }

    public long evaluate(long value) {
        return Math.abs(value);
    }

    public int evaluate(int value) {
        return Math.abs(value);
    }
}
cstenac / ArraySum.java
Created February 19, 2013 09:01
Simple Hive UDF (3)
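import java.util.List;

import org.apache.hadoop.hive.ql.exec.UDF;

/** A simple UDF to sum the non-null elements of an array of doubles */
public class ArraySum extends UDF {
    public double evaluate(List<Double> value) {
        double sum = 0;
        for (int i = 0; i < value.size(); i++) {
            if (value.get(i) != null) {
                sum += value.get(i);
            }
        }
        return sum;
    }
}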
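The summing loop skips null elements but would throw a `NullPointerException` if the list itself were null. A minimal sketch of the same logic as a plain static method follows, with an added guard for a null list. Treating a null list as an empty one is an assumption of this sketch, not behavior taken from the gist, and the names `ArraySumSketch` and `sum` are hypothetical.

```java
import java.util.List;

/** Standalone sketch with no Hive dependency. */
final class ArraySumSketch {
    /** Sums the non-null elements; a null list is treated as empty (assumption). */
    static double sum(List<Double> values) {
        if (values == null) {
            return 0;
        }
        double sum = 0;
        for (Double v : values) {
            // Skip null elements, as the original loop does.
            if (v != null) {
                sum += v;
            }
        }
        return sum;
    }
}
```

Calling `ArraySumSketch.sum(java.util.Arrays.asList(1.0, null, 2.5))` adds only the two non-null values.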
cstenac / GenericUDF.java
Created February 19, 2013 09:03
Hive GenericUDF interface (simplified)
public interface GenericUDF {
    public Object evaluate(DeferredObject[] args) throws HiveException;
    public String getDisplayString(String[] args);
    public ObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException;
}
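The three methods split the work: `initialize` checks the argument types and declares the return type before any row is processed, `evaluate` runs per row on lazily fetched arguments, and `getDisplayString` produces a readable description of the call. A rough sketch of that division of labor follows, using hypothetical stand-in types so it compiles without Hive. `Deferred`, `MiniGenericUdf`, and `MiniArrayFirst` are invented for illustration, and type names are modeled as plain strings rather than `ObjectInspector` instances.

```java
import java.util.List;

/** Hypothetical stand-in for Hive's DeferredObject: the value is fetched lazily. */
interface Deferred {
    Object get();
}

/** Hypothetical stand-in for the simplified interface above; types are plain strings. */
interface MiniGenericUdf {
    String initialize(String[] argTypes);   // validate arguments, return the result type
    Object evaluate(Deferred[] args);       // compute the result for one row
    String getDisplayString(String[] args); // human-readable form of the call
}

/** Returns the first element of an array argument, or null if there is none. */
final class MiniArrayFirst implements MiniGenericUdf {
    private String elementType;

    @Override
    public String initialize(String[] argTypes) {
        if (argTypes.length != 1
                || !argTypes[0].startsWith("array<")
                || !argTypes[0].endsWith(">")) {
            throw new IllegalArgumentException("array_first takes exactly one array argument");
        }
        // The result type is the element type of the input array.
        elementType = argTypes[0].substring("array<".length(), argTypes[0].length() - 1);
        return elementType;
    }

    @Override
    public Object evaluate(Deferred[] args) {
        // The argument is only materialized here, when it is actually needed.
        List<?> list = (List<?>) args[0].get();
        if (list == null || list.isEmpty()) {
            return null;
        }
        return list.get(0);
    }

    @Override
    public String getDisplayString(String[] args) {
        return "array_first(" + args[0] + ")";
    }
}
```

In real Hive code the type checks would go through `ObjectInspector` categories instead of string matching; the strings here only keep the sketch self-contained.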
cstenac / UDFArrayFirst.java
Created February 19, 2013 09:24
Hive Generic UDF : array_first
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;
public class UDFArrayFirst extends GenericUDF {
ListObjectInspector listInputObjectInspector;
cstenac / UDFMultiplyByTwo.java
Created February 19, 2013 09:35
Hive Generic UDF : multiply input integer by two
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector.PrimitiveCategory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.IntWritable;
Process: node [56095]
Path: /Applications/TileMill.app/Contents/Resources/node
Identifier: node
Version: ???
Code Type: X86-64 (Native)
Parent Process: node [56091]
User ID: 501
Date/Time: 2013-07-11 19:05:23.631 +0200
OS Version: Mac OS X 10.8.4 (12E55)
cstenac / gist:a602e8f630958580740f
Created October 10, 2014 14:05
Dataiku skutils library
import StringIO
from sklearn.tree import _tree
import random

def dataframe_train_test_split(size, X, Y):
    ## sklearn.cross_validation would not respect X / Y original index
    if type(size) != int:
        size = int(len(X.index) * size)
cstenac / autorestart.sh
Created July 27, 2016 06:37
Automatic restart of DSS if get-configuration call times out
#! /bin/sh
RESTART_SCRIPT=/path/to/dssrestart.sh
STACKS_BACKUP_DIR=/path/to/dir
DSS_URL=http://localhost:32000
MAX_TIME=600 # 10 minutes timeout
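# Call the configuration endpoint; curl exits non-zero if the call fails or MAX_TIME is exceeded
curl -s -m "$MAX_TIME" "$DSS_URL/dip/api/get-configuration" > /dev/null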
RETCODE=$?

Keybase proof

I hereby claim:

  • I am cstenac on github.
  • I am cstenac (https://keybase.io/cstenac) on keybase.
  • I have a public key ASBp4SkWXpY9gqiw4R5Hjc0h0BCEMx7eC6k3ta5PyN51UAo

To claim this, I am signing this object: