Krishnan Raman krishnanraman

import com.twitter.scalding._
import TDsl._
class mult(args:Args) extends Job(args) {
def mkRow(columns:Int, dominant:Int):Seq[Double] = Seq.tabulate[Double](columns)(i=> if (i==dominant) 5+math.random*10 else math.random)
val n:Int = args("rows").toInt
/* Typed - FAILS :( OOM for n as small as 10,000
TypedPipe.from(0 to n)
.map{
@krishnanraman
krishnanraman / mult.scala
Created July 27, 2014 01:16
unorthodox benchmark
n Time(real)
============
Typed ($ time scald.rb --hdfs-local mult.scala --typed true --n XXX)
=====
250 7.4
500 8.6
1000 14.1
2000 36.0
krishnanraman / gist:e8289eac92d28856e089
Created July 27, 2014 05:05
sets in typed vs fields
scala> TypedPipe.from((0 until 10).toList).groupAll.size.dump
14/07/26 21:58:42 INFO flow.Flow: [ScaldingShell] starting
14/07/26 21:58:42 INFO flow.Flow: [ScaldingShell] source: MemoryTap["NullScheme"]["0.6803924901219981"]
14/07/26 21:58:42 INFO flow.Flow: [ScaldingShell] sink: MemoryTap["NullScheme"]["0.3895948780547176"]
14/07/26 21:58:42 INFO flow.Flow: [ScaldingShell] parallel execution is enabled: true
14/07/26 21:58:42 INFO flow.Flow: [ScaldingShell] starting jobs: 1
14/07/26 21:58:42 INFO flow.Flow: [ScaldingShell] allocating threads: 1
14/07/26 21:58:42 INFO flow.FlowStep: [ScaldingShell] starting step: local
((),10)
krishnanraman / A.scala
Last active August 29, 2015 14:04
ExecutionContext
import com.twitter.scalding.typed.MemorySink
import com.twitter.scalding._
import com.twitter.scalding.ExecutionContext._
import com.twitter.algebird.monad._
class A(args:Args)extends ExecutionContextJob(args) {
override def job: Reader[ExecutionContext, Nothing] = {
val (r, stats) = Execution.waitFor(Config.default, Local(false)) { implicit ec: ExecutionContext =>
val sink = new MemorySink[(Int, Int)]
krishnanraman / e.scala
Last active August 29, 2015 14:04
Compute Euler's number using Iterative Map-Reduce via ExecutionContextJob
import com.twitter.scalding._
import com.twitter.scalding.ExecutionContext._
import com.twitter.algebird.monad._
class e(args:Args)extends ExecutionContextJob(args) {
def factorial(x:Int): Long = {assert (x<21); if (x==0) 1 else x*factorial(x-1) }
override def job: Reader[ExecutionContext, Nothing] = {
Execution.waitFor(Config.default, Local(false)) { implicit ec: ExecutionContext =>
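The preview above is cut off before the iterative part. As a rough, non-distributed sketch of the arithmetic it appears to be building toward, the following plain Scala sums the series e = 1/0! + 1/1! + 1/2! + ... with the same factorial guard. That the gist uses this series is an assumption; `EulerSketch` and `approxE` are hypothetical names, and nothing here uses Scalding.

```scala
object EulerSketch {
  // Same guard as the gist: 20! is the largest factorial that fits in a Long.
  def factorial(x: Int): Long = {
    require(x >= 0 && x < 21, "factorial overflows Long beyond 20")
    if (x == 0) 1L else x * factorial(x - 1)
  }

  // Partial sum of e = sum over k >= 0 of 1/k!, using the first `terms` terms.
  // (Assumed series; the truncated gist does not show which formula it uses.)
  def approxE(terms: Int): Double =
    (0 until terms).map(k => 1.0 / factorial(k)).sum

  def main(args: Array[String]): Unit =
    println(approxE(21)) // terms may be at most 21 because of the factorial guard
}
```

Because the guard caps the factorial at 20!, at most 21 terms can be summed, which is already enough to match `math.E` to within Double rounding error.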
krishnanraman / testconv.scala
Created August 2, 2014 01:27
How many numbers do you need to add to exceed 2000 ?
$ scald.rb --hdfs-local testconv.scala
compiling testconv.scala
scalac -classpath /Users/kraman/.sbt/boot/scala-2.9.3/lib/scala-library.jar:/Users/kraman/.sbt/boot/scala-2.9.3/lib/scala-compiler.jar:/Users/kraman/workspace/scalding/scalding-core/target/scala-2.9.3/scalding-core-assembly-0.11.1.jar:/var/folders/b_/17q0nsss269_2kf855mtg4_c0000gn/T/maven/hadoop-core-1.1.2.jar:/var/folders/b_/17q0nsss269_2kf855mtg4_c0000gn/T/maven/commons-codec-1.8.jar:/var/folders/b_/17q0nsss269_2kf855mtg4_c0000gn/T/maven/commons-configuration-1.9.jar:/var/folders/b_/17q0nsss269_2kf855mtg4_c0000gn/T/maven/jackson-asl-0.9.5.jar:/var/folders/b_/17q0nsss269_2kf855mtg4_c0000gn/T/maven/jackson-mapper-asl-1.9.13.jar:/var/folders/b_/17q0nsss269_2kf855mtg4_c0000gn/T/maven/commons-lang-2.6.jar:/var/folders/b_/17q0nsss269_2kf855mtg4_c0000gn/T/maven/slf4j-log4j12-1.6.6.jar:/var/folders/b_/17q0nsss269_2kf855mtg4_c0000gn/T/maven/log4j-1.2.15.jar:/var/folders/b_/17q0nsss269_2kf855mtg4_c0000gn/T/maven/commons-httpclient-3.1.jar:/var/folders/b
krishnanraman / h3
Last active August 29, 2015 14:05
Fun problem sets from Pinter: h3-8, p43. h7 requires "stroke of genius". The rest are elementary.
If xax = e, then (xa)^(2n) = a^n
xax = e
=> xaxa = a (multiply both sides on the right by a)
=> (xa)^2 = a
=> (xa)^(2n) = ((xa)^2)^n = a^n
QED.
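A brute-force sanity check of this statement on a small concrete group can be sketched as follows. It is not a proof and is not part of the original notes: it enumerates the symmetric group S3 as permutations of 0, 1, 2, finds every pair (x, a) with xax = e, and tests the claimed identity for a few values of n. `PinterCheck` and all names inside it are hypothetical.

```scala
object PinterCheck {
  type Perm = Vector[Int] // p(i) is the image of i under the permutation p

  val e: Perm = Vector(0, 1, 2) // identity of S3
  val perms: List[Perm] = Vector(0, 1, 2).permutations.toList

  // Group product: (p * q)(i) = p(q(i)).
  def mul(p: Perm, q: Perm): Perm = q.map(p)

  // p^n by repeated multiplication, with p^0 = e.
  def pow(p: Perm, n: Int): Perm =
    (1 to n).foldLeft(e)((acc, _) => mul(acc, p))

  // All pairs (x, a) in S3 that satisfy the hypothesis xax = e.
  val pairs: List[(Perm, Perm)] =
    for (x <- perms; a <- perms if mul(mul(x, a), x) == e) yield (x, a)

  // The claim (xa)^(2n) = a^n, checked for n = 1..4 on every such pair.
  val claimHolds: Boolean = pairs.forall { case (x, a) =>
    (1 to 4).forall(n => pow(mul(x, a), 2 * n) == pow(a, n))
  }
}
```

Since xax = e forces a to be the inverse of x squared, each x has exactly one matching a, so `pairs` has six entries in S3.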
case class BisectPipe[T](pipe:TypedPipe[T], size:Int, sortBy: T => Int) {
  // First half of the values in sort order.
  def top = BisectPipe(pipe.groupAll.sortBy(sortBy).take(size/2).values, size/2, sortBy)
  // Remaining values; size - size/2 keeps the count right when size is odd.
  def bottom = BisectPipe(pipe.groupAll.sortBy(sortBy).drop(size/2).values, size - size/2, sortBy)
}
scala> def sortBy(x:Int) = x
scala> val bp = BisectPipe(TypedPipe.from((10 to 1 by -1).toList), 10, sortBy )
bp: BisectPipe[Int] = BisectPipe(IterablePipe(List(10, 9, 8, 7, 6, 5, 4, 3, 2, 1)),10,<function1>)
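To see what `top` and `bottom` select without running a Scalding flow, here is a minimal in-memory analogue over a plain `List`. It is only a sketch: `BisectList` is a hypothetical stand-in for `BisectPipe`, and it derives the size from the list instead of taking it as a parameter.

```scala
// In-memory analogue of BisectPipe (hypothetical name), using List instead of TypedPipe.
case class BisectList[T](items: List[T], sortBy: T => Int) {
  private def sorted: List[T] = items.sortBy(sortBy)

  // First half of the items in sort order.
  def top: BisectList[T] = BisectList(sorted.take(items.size / 2), sortBy)

  // Everything after the first half; keeps the extra item when the size is odd.
  def bottom: BisectList[T] = BisectList(sorted.drop(items.size / 2), sortBy)
}

object BisectListDemo {
  def main(args: Array[String]): Unit = {
    val b = BisectList((10 to 1 by -1).toList, (x: Int) => x)
    println(b.top.items)        // smallest five values, ascending
    println(b.bottom.items)     // largest five values, ascending
    println(b.top.bottom.items) // bisecting again narrows the range
  }
}
```

Chaining `top` and `bottom` narrows the range the way a binary search would, which seems to be the intent of the pipe version.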
import com.twitter.scalding._
import com.twitter.scalding.ExecutionContext._
import com.twitter.algebird.monad._
import com.twitter.scalding.typed.MemorySink
case class BisectPipe[T](pipe:TypedPipe[T], size:Int, sortBy: T => Double) {
  // First half of the values in sort order.
  def top = BisectPipe(pipe.groupAll.sortBy(sortBy).take(size/2).values, size/2, sortBy)
  // Remaining values; size - size/2 keeps the count right when size is odd.
  def bottom = BisectPipe(pipe.groupAll.sortBy(sortBy).drop(size/2).values, size - size/2, sortBy)
}
package example
import scala.scalajs.js
import js.annotation.JSExport
import org.scalajs.dom
import scala.util.Random
import scala.math.{sin, Pi}
object BarAndPieChart extends js.JSApp {
val H = 3000