Russell Jurney rjurney
@rjurney
rjurney / dhcp.json
Created August 12, 2014 22:15
DHCP Data
{
u'cefSignatureId': u'DHCPACK',
u'destinationAddress': u'45.102.1.162',
u'destinationDnsDomain': u'cathedraticumunwedded',
u'destinationHostName': u'cathedraticumunwedded',
u'destinationMacAddress': u'2b: f5: 32: b8: 92: 99',
u'destinationNameOrIp': u'cathedraticumunwedded',
u'destinationNtDomain': None,
u'deviceAddress': None,
u'deviceDnsDomain': None,
rjurney / pom.xml
Created August 15, 2014 21:05
My pom.xml
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase</artifactId>
<version>0.94.6-cdh4.4.0</version>
</dependency>
rjurney / foo.screen
Created August 23, 2014 19:40
Why are my controllers responding with 404?
Sinatra doesn’t know this ditty.
Try this:
get '/network' do
"Hello World"
end
rjurney / fun.scala
Created August 23, 2014 19:43
Fun with Scala
@GET
@Path("/groupHttpTimeSeries")
def getTimeSeriesForAllGroups(@QueryParam("startTime") startDateStr : String,
@QueryParam("endTime") endDateStr : String,
@QueryParam("type") typeStr : String,
@QueryParam("period") periodInt : Int) = {
val buf = httpTimeSeriesDao.getTimeSeriesForAllGroups(startDateStr, endDateStr, typeStr, periodInt) // Group by the unique fields
val timeEntries = buf.map(x => scala.collection.mutable.Map[String,Any](
"type" -> x.get("type"),
"groupField" -> x.get("groupField"),
rjurney / graph.gdf
Created August 23, 2014 22:24
Broken GDF Parsing
nodedef> name VARCHAR,label VARCHAR,width DOUBLE,height DOUBLE,x DOUBLE,y DOUBLE,color VARCHAR,group INTEGER,PageRank DOUBLE default 0.0,Modularity Class INTEGER default 0
Openosmium,"Openosmium",3.910848,3.910848,471.56708,-473.07312,'1,121,211',3,9.68751141700948E-4,12
Amazon Web Services,"Amazon Web Services",16.691055,16.691055,-274.28784,-539.70593,'211,181,1',0,0.005222096681049561,12
Cloudera,"Cloudera",35.35964,35.35964,-495.23416,-334.80838,'211,181,1',0,0.011435134917504607,12
Mapr,"Mapr",21.85529,21.85529,-596.565,-481.3974,'211,181,1',0,0.006940790291135599,12
Hortonworks,"Hortonworks",50.0,50.0,-987.1078,-273.9822,'211,181,1',0,0.016307553457128563,12
At&T,"At&T",3.695927,3.695927,-1155.9557,-238.05418,'211,181,1',0,8.972238723982221E-4,9
Csc,"Csc",5.038145,5.038145,-936.2015,-1289.719,'211,181,1',0,0.001343923876290424,9
Hewlatt Packard,"Hewlatt Packard",13.161046,13.161046,-946.6715,-384.9222,'211,181,1',0,0.004047284056073951,12
Google,"Google",7.8337407,7.8337407,-193.3688,-367.41632,'211,181
rjurney / ChooseByFieldName.java
Created September 11, 2014 21:05
UDF for Pig to choose a field by the value of another field
// Enable multiple languages by specifying the model path. See http://text.sourceforge.net/models-1.5/
public Tuple exec(Tuple input) throws IOException
{
if(input.size() < 2) {
throw new IOException();
}
String fieldName = input.get(0).toString();
if(fieldName == null || fieldName.isEmpty()) {
return null;
rjurney / ChooseFieldByValue.java
Last active August 29, 2015 14:06
ChooseFieldByJava UDF with problemos
public class ChooseFieldByValue extends EvalFunc<Tuple>
{
private TupleFactory tf = TupleFactory.getInstance();
// Enable multiple languages by specifying the model path. See http://text.sourceforge.net/models-1.5/
public Tuple exec(Tuple input) throws IOException
{
if(input.size() < 2) {
throw new IOException();
}
rjurney / gist:fd5c0110fe7eb686afc9
Last active August 29, 2015 14:07
What is wrong with this PySpark JOIN?
from pyspark import SparkContext
from pyspark.sql import SQLContext
sc = SparkContext(appName="Test App")
lines = sc.textFile("devices.tsv")
devices = lines.map(lambda x: x.split("\t"))
dh = sc.textFile("devices.header").collect()[0].split("\t")
devices = devices.map(lambda x: {'deviceid': x[dh.index('id')], 'foo': x[dh.index('foo')], 'bar': x[dh.index('bar')]})
# Try a bigger dataset and join
th = sc.textFile("transactions.header").collect()[0].split("\t")
rjurney / run.sh
Last active August 29, 2015 14:10
rake helpers:run
Russells-MacBook-Pro-OLD:cluster rjurney$ rake helpers:run
docker run --name=host_filer --hostname=host-filer --volume=/var/lib/docker/hosts:/srv/hosts --volume=/var/run/docker.sock:/var/run/docker.sock --detach blalor/docker-hosts /srv/hosts
2014/11/19 20:34:00 Error response from daemon: Conflict, The name host_filer is already assigned to 64b7400ba3a9. You have to delete (or rename) that container to be able to assign host_filer to a container again.
Command docker run --name=host_filer --hostname=host-filer --volume=/var/lib/docker/hosts:/srv/hosts --volume=/var/run/docker.sock:/var/run/docker.sock --detach blalor/docker-hosts /srv/hosts exited unsuccessfully (1)
rjurney / cogroup_join_equivalent.pig
Created December 26, 2014 19:31
A Pig JOIN is just a COGROUP/FLATTEN
slugging_fatness_join = JOIN fatness BY player_id, slugging_stats BY player_id;
-- Equivalent
slugging_fatness_join = FOREACH
(COGROUP fatness BY player_id, slugging_stats BY player_id)
GENERATE
FLATTEN(fatness),
FLATTEN(slugging_stats);
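
The equivalence above can be sketched outside Pig. The following is a minimal plain-Python illustration, not Pig's implementation: `cogroup` and `join_via_cogroup` are hypothetical helper names, and the `bmi` and `slg` fields and their values are made up for the example. Grouping both relations by key and then flattening both bags yields the per-key cross product, which is what an inner JOIN produces.

```python
from collections import defaultdict


def cogroup(left, right, key):
    """Group two lists of dict rows by `key`, loosely mirroring Pig's COGROUP."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[row[key]][0].append(row)
    for row in right:
        groups[row[key]][1].append(row)
    return groups


def join_via_cogroup(left, right, key):
    """Flatten both bags of every group, giving the per-key cross product."""
    joined = []
    for left_bag, right_bag in cogroup(left, right, key).values():
        # Flattening an empty bag yields no rows, so keys present in only
        # one relation disappear, as they do in an inner JOIN.
        for left_row in left_bag:
            for right_row in right_bag:
                joined.append((left_row, right_row))
    return joined


# Hypothetical sample data; only player_id comes from the Pig snippet.
fatness = [
    {"player_id": "a", "bmi": 27.1},
    {"player_id": "b", "bmi": 24.3},
]
slugging_stats = [
    {"player_id": "a", "slg": 0.512},
    {"player_id": "a", "slg": 0.478},
    {"player_id": "c", "slg": 0.401},
]

result = join_via_cogroup(fatness, slugging_stats, "player_id")
for left_row, right_row in result:
    print(left_row, right_row)
```

Players `b` and `c` each appear in only one relation, so one of their bags is empty and they contribute no rows; player `a` contributes one row per pairing.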