@abramsm
abramsm / gist:8943880
Last active August 29, 2015 13:56
Sample Hydra Map Section
"map":{
    "filterOut":{"op":"chain", "filter":[
        {"op":"field", "from":"UID", "filter":{"op":"trim"}},
        {"op":"field", "from":"IP", "filter":{"op":"trim"}},
        {"op":"field", "from":"TERMS", "filter":{"op":"trim"}},
        {"op":"num", "columns":["QUERY_TIME", "QUERY_TIME_MOD"], "define":"c0,v1000,dmult,v1,set"},
        {"op":"debug"},
    ]},
},
abramsm / gist:8941283
Created February 11, 2014 18:42
Sample Hydra Source for consuming log-synth data
"source":{
    "type":"mesh2",
    "hash":true,
    "mesh":{
        "files":["log-synth/sample*"],
    },
    "format":{
        "type":"column",
        "columns":["QUERY_TIME", "UID", "IP", "TERMS"],
        "tokens":{
abramsm / gist:5484296
Created April 29, 2013 19:59
hll+ example with merge
for (int p = 10; p < 18; p++)
{
    ICardinality[] hlls = new ICardinality[30];
    int ecount = 15000;
    int totalCount = ecount * 30;
    for (int j = 0; j < 30; j++)
    {
        hlls[j] = new HyperLogLogPlus(p, 25);
        for (int i = 0; i < ecount; i++)
        {
abramsm / gist:5483928
Created April 29, 2013 19:09
example HLL+ for different p values
for (int p = 10; p < 18; p++)
{
    HyperLogLogPlus hyperLogLogPlus = new HyperLogLogPlus(p, 25);
    int count = 4200000;
    for (int i = 0; i < count; i++)
    {
        hyperLogLogPlus.offer("i" + i);
    }
    long estimate = hyperLogLogPlus.cardinality();
    double se = count * (1.04 / Math.sqrt(Math.pow(2, p)));
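
The `se` expression in the preview above applies the usual HyperLogLog standard error, 1.04 / sqrt(2^p), to the true count. As a minimal sketch that does not depend on stream-lib, the following prints that bound for the same range of `p`. The class and method names are hypothetical and are not part of the gist; the real estimator may do better than this bound at low cardinalities.

```java
public class HllStandardError {
    // Relative standard error of a HyperLogLog estimate that uses 2^p registers.
    public static double relativeStandardError(int p) {
        return 1.04 / Math.sqrt(Math.pow(2, p));
    }

    // One standard deviation of absolute error for a given true count.
    public static double expectedAbsoluteError(long count, int p) {
        return count * relativeStandardError(p);
    }

    public static void main(String[] args) {
        long count = 4_200_000L; // same count as in the gist preview
        for (int p = 10; p < 18; p++) {
            System.out.printf("p=%d relative=%.4f%% absolute=%.0f%n",
                    p, 100 * relativeStandardError(p), expectedAbsoluteError(count, p));
        }
    }
}
```

Each increment of `p` doubles the register count and shrinks the error by a factor of sqrt(2), so the choice of `p` is a direct memory-versus-accuracy trade.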
abramsm / gist:5297210
Last active December 15, 2015 17:29
Testing merging n small sets in HyperLogLogPlus
public static void main(final String[] args) throws Throwable {
    long startTime = System.currentTimeMillis();
    int numSets = 10;
    int setSize = 1 * 1000 * 1000;
    int repeats = 5;
    HyperLogLogPlus[] counters = new HyperLogLogPlus[numSets];
    for (int i = 0; i < numSets; i++) {
        counters[i] = new HyperLogLogPlus(15, 15);
abramsm / HyperLogLogPlus.java
Created February 1, 2013 03:01
Offering object to HyperLogLog++
public boolean offer(Object o)
{
    long x = MurmurHash.hash64(o);
    switch (format)
    {
        case NORMAL:
            // find first p bits of x
            final long idx = x >>> (64 - p);
            // Ignore the first p bits (the idx), and then find the number of leading zeros
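
The preview above is cut off right after the index is extracted. As a rough, standalone sketch of the two steps its comments describe (take the first `p` bits as the register index, then count leading zeros in the remaining bits), the following shows one way to do it. The class name, the method names, and the sentinel-bit trick are assumptions made for illustration; they are not taken from the truncated gist.

```java
public class HllIndexAndRank {
    // Register index: the first (most significant) p bits of the 64-bit hash.
    public static int registerIndex(long hash, int p) {
        return (int) (hash >>> (64 - p));
    }

    // Rank: 1 + number of leading zeros in the bits that follow the index.
    // Shifting left by p drops the index bits. The sentinel bit at position
    // p - 1 caps the count, so an all-zero remainder gives rank 64 - p + 1.
    public static int rank(long hash, int p) {
        long remainder = (hash << p) | (1L << (p - 1));
        return Long.numberOfLeadingZeros(remainder) + 1;
    }

    public static void main(String[] args) {
        int p = 14;
        long hash = 1L << 48; // index bits are all zero, one zero bit precedes the first set bit
        System.out.println("idx=" + registerIndex(hash, p) + " rank=" + rank(hash, p));
    }
}
```

In a full estimator, each register would keep the maximum rank seen for its index; that update step is omitted here.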