* gradle clean
* gradle build
* hadoop jar build/libs/pivot-table-1.0-SNAPSHOT.jar com.dvidr.PivotTable src/main/resources/pivotdata.txt src/main/resources/output
package com.dvidr;
import random
shuf = random.sample(range(100000), 1000)
# Keep bag of k=10 elements
# Prefill the bag with first shuf items
bag = shuf[:10]
import networkx as nx
import matplotlib.pyplot as plt
import matplotlib
# -------------- Get Data/Graphs -------------
with open('graph-data/g1.txt', 'r') as g:
g1 = map(lambda x: eval(x), g.readlines())
with open('graph-data/g2.txt', 'r') as g:
from __future__ import print_function
Using OpenDNS domain query activity, we retrieve 5 days
of queries/hour to a domain for 240+ domains (stored
in dns.json). We predict the number of queries in
the next hour using a LSTM recurrent neural network.
An ad hoc anomaly detection is outlined in the final
for loop.
View AliceInGraphXLand.scala
package com.dvidr
import org.apache.spark.graphx.{VertexRDD, Edge, Graph}
import org.apache.spark.sql._
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Random
case class Chat(id: Int, name: String, talk: String)
case class ChatGraph(id: Int, dst: Int, replyIn: Int, name: String, talk: String)
case class TopChat(id: Int, name: String, talk: String, inDeg: Int, outDeg: Int)
View AliceInTwitterLand.scala
package com.dvidr
import com.twitter.algebird.{Moments, Aggregator}
import scala.util.Random
Please refer to AliceInAggregatorLand first:
This is an adaption where "Alice In Wonderland" is turned
View MyMonoid.scala
package com.dvidr.storm.bolt
import com.twitter.algebird.Monoid
class MyMonoid[K, V <: Numeric] extends Monoid[Map[K, V]] {
override def zero = Map[K, V ]()
override def plus(x: Map[K, V], y: Map[K, V]): Map[K, V] = {
val list = x.toList ++ y.toList
return list.groupBy ( _._1) .map { case (k,v) => k -> }
View TwoAkkaStreams.scala
package com.dvidr
import{RunnableGraph, Sink, Source, Keep}
import scala.concurrent.Future
import scala.util.{Success, Failure}
View AkkaStreams.scala
package com.dvidr
import{Sink, Source}
import scala.concurrent.Future
import scala.util.{Success, Failure}