women | jeans | Levi's 512 | 512 Perfectly Slimming Bootcut, Midnight Star Wash jeans | 54.00 | 39.00
---|---|---|---|---|---
women | jeans | Tommy Hilfiger Jeans | Hope Boot Cut, Caroline Original Wash jeans | 69.50 | 54.00
women | jeans | Lauren Jeans Co. Jeans | Slimming Bootcut, Rinse Wash jeans | 80.00 | 65.00
women | jeans | INC International Concepts Jeans | Curvy-Fit Skinny Ankle-Length jeans | 69.00 | 48.99
women | jeans | MICHAEL Michael Kors Jeans | Skinny Colored Denim jeans | 89.50 | 69.99
women | tops | Karen Scott Top | Short-Sleeve Boat-Neck top | 29.00 | 12.99
women | tops | Cable & Gauge Top | Three-Quarter-Sleeve Solid Twist Front top | 24.98 | 19.99
women | tops | Style&co. Top | Long-Sleeve Striped Button-Down Shirt top | 39.98 | 29.99
women | tops | Style&co. Top | Bell-Sleeve Printed Tunic top | 49.98 | 35.99
women | tops | INC International Concepts Top | Short-Sleeve Ruched Tee top | 29.50 | 24.99
Bandolino Shoes | Rampage Shoes | 0.6702
---|---|---
Bandolino Shoes | Marc Fisher Shoes | 0.73206
Bandolino Shoes | Nine West Shoes | 0.81128
Cable & Gauge Top | INC International Concepts Top | 0.65449
Cable & Gauge Top | Karen Scott Top | 0.68115
Cable & Gauge Top | Style&co. Top | 0.9547
INC International Concepts Jeans | MICHAEL Michael Kors Jeans | 0.69658
INC International Concepts Jeans | Tommy Hilfiger Jeans | 0.75921
INC International Concepts Jeans | Levi's 512 | 0.80887
INC International Concepts Top | Karen Scott Top | 0.47929
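The listing does not say how the pairwise scores above were computed, or whether they are similarities or distances. Since the Scalding job in this collection imports Mahout's `TanimotoDistanceMeasure`, the following is a minimal sketch of the Tanimoto coefficient on plain numeric vectors, for orientation only. The object name `TanimotoSketch` and the feature vectors in the usage note are assumptions, not part of the original job.

```scala
// Illustrative sketch only: not the author's job, and not Mahout's class.
object TanimotoSketch {
  // Tanimoto similarity: a.b / (|a|^2 + |b|^2 - a.b)
  def similarity(a: Seq[Double], b: Seq[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val denom = a.map(x => x * x).sum + b.map(y => y * y).sum - dot
    // Two all-zero vectors are treated as identical.
    if (denom == 0.0) 1.0 else dot / denom
  }

  // The corresponding distance is one minus the similarity.
  def distance(a: Seq[Double], b: Seq[Double]): Double =
    1.0 - similarity(a, b)
}
```

For example, `TanimotoSketch.similarity(Seq(1.0, 0.0, 1.0), Seq(1.0, 1.0, 0.0))` divides a dot product of 1 by 2 + 2 - 1 = 3.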
package com.mansur.scalding
import com.twitter.scalding._
import org.apache.lucene.search.spell._
import org.apache.mahout.common.distance.TanimotoDistanceMeasure
import org.apache.mahout.math.DenseVector
import org.apache.commons.math.util.MathUtils
/** |
buildscript {
    repositories {
        maven {
            url "http://repository-uncommon-configuration.forge.cloudbees.com/release/"
        }
        mavenCentral()
    }
    dependencies {
        classpath 'org.github.mansur.oozie:gradle-oozie-plugin:0.1'
<workflow-app xmlns='uri:oozie:workflow:0.1' name='oozie_flow'>
  <start to='ingestor' />
  <action name='ingestor'>
    <java>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>mapred.job.queue.name</name>
          <value>default</value>
object WordPair {
  def main(args: Array[String]) = {
    val line = "Android is a Linux-based operating system designed primarily for touchscreen mobile devices".split(" ").toList
    val pairs = wordPair(line)
    println(pairs)
  }
  def wordPair(line: List[String]): List[(String, String)] = line match {
    case Nil => Nil
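The preview above is cut off after the `Nil` case, so the intended pairing rule is not visible. As one possible reading, the sketch below pairs each word with every word that follows it on the line. The object name `WordPairSketch` and the pairing rule itself are assumptions.

```scala
// Illustrative completion under a stated assumption: each word is paired
// with every later word. The original snippet may pair words differently.
object WordPairSketch {
  def wordPair(line: List[String]): List[(String, String)] = line match {
    case Nil => Nil
    case head :: tail =>
      // Pair the first word with all remaining words, then recurse on the rest.
      tail.map(w => (head, w)) ::: wordPair(tail)
  }

  def main(args: Array[String]): Unit = {
    val line = "Android is a Linux-based operating system".split(" ").toList
    println(wordPair(line))
  }
}
```

The recursion is not tail-recursive, which is fine for a single line of text but would need rewriting for very long inputs.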
1. General Background and Overview
a) Probabilistic Data Structures for Web Analytics and Data Mining, on the Highly Scalable Blog (http://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/): A great overview of probabilistic data structures and how they are used to implement approximation algorithms.
b) Models and Issues in Data Stream Systems (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.106.9846)
c) Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani (http://www.vldb.org/conf/2002/S10P03.pdf): One of the early papers on the subject.
d) Methods for Finding Frequent Items in Data Streams by Graham Cormode & Marios Hadjieleftheriou (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&type=pdf)
e) The Space Complexity of Approximating the Frequency Moments by Noga Alon, Yossi Matias, Mario Szegedy: One of the most influential papers introducing succinctness in computing fre
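To make the reading list more concrete, here is a minimal sketch of Lossy Counting, one of the algorithms described in the Manku & Motwani paper in item c). The class name `LossyCounter` and its method names are illustrative assumptions. Each stored count may undercount the true frequency by at most epsilon times the number of items seen so far.

```scala
import scala.collection.mutable

// Illustrative sketch of Lossy Counting; names are hypothetical.
final class LossyCounter[A](epsilon: Double) {
  require(epsilon > 0.0 && epsilon < 1.0, "epsilon must be in (0, 1)")

  private val width: Int = math.ceil(1.0 / epsilon).toInt
  // item -> (observed count, maximum possible undercount)
  private val entries = mutable.Map.empty[A, (Long, Long)]
  private var n: Long = 0L

  def add(item: A): Unit = {
    n += 1
    val bucket = (n + width - 1) / width // ceil(n / width)
    entries.get(item) match {
      case Some((count, delta)) => entries.update(item, (count + 1, delta))
      case None                 => entries.update(item, (1L, bucket - 1))
    }
    // At each bucket boundary, drop entries that cannot be frequent.
    if (n % width == 0) {
      val stale = entries.collect {
        case (k, (count, delta)) if count + delta <= bucket => k
      }.toList
      entries --= stale
    }
  }

  // Never overestimates; undercounts by at most epsilon * total.
  def estimate(item: A): Long = entries.get(item).map(_._1).getOrElse(0L)

  def total: Long = n
}
```

A caller would feed every stream element to `add` and then report items whose `estimate` exceeds the chosen support threshold minus `epsilon * total`.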
override def multiPut[K1 <: String](kvs: Map[K1, Option[String]]): Map[K1, Future[Unit]] = {
  implicit val inj = StringCodec.utf8
  multiPutValues(kvs)
}
def multiPutValues[K: Codec, V: Codec](kvs: Map[K, Option[V]]): Map[K, Future[Unit]] = {...}
Error:(84, 19) Cannot find Injection type class from K1 to Array[Byte]
multiPutValues(kvs)
^
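A likely cause of the error above is that the context bound `K: Codec` is resolved for `K1`, not for `String`, and an invariant type class instance for `String` does not satisfy a request for `K1 <: String`. The sketch below reproduces the shape of the problem with a hypothetical stand-in `Codec` trait (not the real Bijection type) and shows one workaround: widen the keys to `String` before the call, then map the results back to the original keys.

```scala
import scala.concurrent.Future

// Hypothetical stand-in for an invariant Codec/Injection type class.
trait Codec[T] { def encode(t: T): Array[Byte] }

object CodecDemo {
  implicit val utf8: Codec[String] = new Codec[String] {
    def encode(s: String): Array[Byte] = s.getBytes("UTF-8")
  }

  // Same signature shape as multiPutValues; the body is a placeholder.
  def multiPutValues[K: Codec, V: Codec](kvs: Map[K, Option[V]]): Map[K, Future[Unit]] =
    kvs.map { case (k, _) => k -> Future.successful(()) }

  def multiPut[K1 <: String](kvs: Map[K1, Option[String]]): Map[K1, Future[Unit]] = {
    // Widen keys to String so the Codec[String] instance applies.
    val widened: Map[String, Option[String]] =
      kvs.map { case (k, v) => (k: String) -> v }
    val results = multiPutValues[String, String](widened)
    // Re-key the results with the caller's original key type.
    kvs.map { case (k, _) => k -> results(k) }
  }
}
```

Whether this fits the real store depends on how `multiPutValues` uses its keys; the point is only that the implicit lookup has to be made at type `String`.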
oozie {
    def common_props = [
        jobTracker: '${jobTracker}',
        namenode: '${nameNode}',
        configuration: ["mapred.job.queue.name": "default"]
    ]
class StreamingQueryService[T] {
  Request[T]
    .flatMap {
      t => Seq(t)
    }
    .lookup(ReadableStore)