Krishnan Raman krishnanraman

@krishnanraman
krishnanraman / human readable time difference.scala
Created May 3, 2017 02:05
human readable time difference java scala
scala> import java.util.concurrent.TimeUnit
import java.util.concurrent.TimeUnit
scala> val units = List((TimeUnit.DAYS,"days"),(TimeUnit.HOURS,"hours"), (TimeUnit.MINUTES,"minutes"), (TimeUnit.SECONDS,"seconds"))
units: List[(java.util.concurrent.TimeUnit, String)] = List((DAYS,days), (HOURS,hours), (MINUTES,minutes), (SECONDS,seconds))
scala> def humanReadable(timediff:Long):String = {
| val init = ("", timediff)
| units.foldLeft(init){ case (acc,next) =>
| val (human, rest) = acc
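The preview of humanReadable is cut off right after the fold body begins. A minimal sketch of one way such a fold could be finished follows. It assumes timediff is in milliseconds (the listing does not say), the object name HumanReadable is made up for this example, and it is not necessarily the author's implementation:

```scala
import java.util.concurrent.TimeUnit

object HumanReadable {
  // Largest unit first, so each step peels off as many whole units as possible.
  val units: List[(TimeUnit, String)] = List(
    (TimeUnit.DAYS, "days"),
    (TimeUnit.HOURS, "hours"),
    (TimeUnit.MINUTES, "minutes"),
    (TimeUnit.SECONDS, "seconds"))

  // ASSUMPTION: timediff is a duration in milliseconds.
  def humanReadable(timediff: Long): String = {
    val init = ("", timediff)
    val (text, _) = units.foldLeft(init) { case (acc, next) =>
      val (human, rest) = acc
      val (unit, name) = next
      val n = unit.convert(rest, TimeUnit.MILLISECONDS) // whole units that fit in the remainder
      val remaining = rest - unit.toMillis(n)            // what is left for the smaller units
      val updated = if (n > 0) s"$human $n $name" else human
      (updated, remaining)
    }
    text.trim // anything under one second yields an empty string
  }
}
```

For example, HumanReadable.humanReadable(90061000L) walks through one day, one hour, one minute and one second. Unit names are not singularized in this sketch.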
krishnanraman / vickrey auction.txt
Created May 1, 2017 22:12
Vickrey auction second price bidding
Vickrey Auction
System Description: There are n bidders
each bidder has a bid b(i) and a true value v(i) that the item is truly worth
payoff to bidder i = v(i) - price paid, if bidder i wins.
payoff to bidder i = 0, if bidder i loses.
The winner of the auction has the highest bid.
However, the winner doesn't pay the highest bid, he pays the second highest bid.
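The rules above can be sketched in a few lines of Scala. This is an illustration only: the names Bidder, run and payoff are made up here, ties are broken by input order (a choice the description does not specify), and the winner's payoff is taken as true value minus the second-highest bid.

```scala
object Vickrey {
  // value is the true worth v(i) to the bidder, bid is the sealed bid b(i).
  final case class Bidder(name: String, value: Double, bid: Double)

  // Returns (winner, price): the highest bid wins, the price is the second-highest bid.
  // ASSUMPTION: at least two bidders; ties go to whoever appears first in the list.
  def run(bidders: List[Bidder]): (Bidder, Double) = {
    require(bidders.size >= 2, "need at least two bidders")
    val sorted = bidders.sortBy(b => -b.bid) // stable sort, descending by bid
    (sorted.head, sorted(1).bid)
  }

  // Payoff is value minus price for the winner, and 0 for everyone else.
  def payoff(b: Bidder, winner: Bidder, price: Double): Double =
    if (b == winner) b.value - price else 0.0
}
```

With bids of 10, 8 and 5 from bidders who each bid their true value, the bidder at 10 wins, pays 8, and has a payoff of 2.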
P1
Why?
Too many images; how to search them?
a. Human annotation is impractical.
b. Content-Based Image Retrieval (CBIR) identifies low-level features (color/shape/texture),
but higher-level features are needed because of the SEMANTIC GAP.
SEMANTIC GAP = the gap between low-level features and the semantic concepts humans use to describe images.
c. Automatic Image Annotation
krishnanraman / ClusterTree.scala
Created July 12, 2016 02:35
Visualize a decision tree (especially a regression tree) in HTML5 using the canvas API
object ClusterTree {
/*
Draw a decision tree in html5 using the canvas api
Returns a valid html5 string, that can be persisted to some foo.html
*/
/*
define vertex V & edge E
package com.marin.dt
import org.apache.spark.rdd.RDD
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.tree.model.{DecisionTreeModel,Node}
import org.apache.spark.mllib.tree.configuration.Algo
import org.apache.spark.mllib.evaluation.RegressionMetrics
/*
Collection of routines to Prune a Decision Tree, & enable the following features
1. Traverse a decision tree
krishnanraman / some results.txt
Created May 26, 2016 21:26
contour integrals on 1/z^k
On the line segment joining 1 to i,
the modulus of the contour integral of 1/z^k is bounded as follows for various powers k:
|Integral(1/z^1)| <= 2
|Integral(1/z^2)| <= 2*sqrt(2)
|Integral(1/z^3)| <= 4
|Integral(1/z^4)| <= 4*sqrt(2)
|Integral(1/z^5)| <= 8
|Integral(1/z^6)| <= 8*sqrt(2)
...
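These bounds are consistent with the standard ML estimate (maximum of the integrand times the length of the path). A short derivation, assuming that is how the numbers were obtained (the note does not say):

```latex
% Parametrize the segment C from 1 to i as z(t) = 1 + t(i - 1), 0 <= t <= 1.
\[
  |z(t)|^2 = (1-t)^2 + t^2 \ge \tfrac12
  \quad\Longrightarrow\quad
  \max_{z \in C} \left|\frac{1}{z^k}\right| = (\sqrt{2})^{k},
  \qquad
  L = |i - 1| = \sqrt{2}.
\]
\[
  \left| \int_C \frac{dz}{z^k} \right|
  \le \max_{z \in C} \left|\frac{1}{z^k}\right| \cdot L
  = (\sqrt{2})^{k} \cdot \sqrt{2}
  = 2^{(k+1)/2}.
\]
% k = 1, 2, 3, 4 give 2, 2\sqrt{2}, 4, 4\sqrt{2}, matching the list.
```

The minimum of |z| on the segment is attained at the midpoint t = 1/2, which is the point of the segment closest to the origin.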
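import java.text.SimpleDateFormat
import java.util.Calendar
/*
Given a date like 2015-12-31, obtain 2015-12-30
*/
def getPreviousDay(mydate:String):String = {
  val fmt = new SimpleDateFormat("yyyy-MM-dd") // not thread-safe, so create one per call
  val date = fmt.parse(mydate)
  val cal = Calendar.getInstance
  cal.setTime(date)
  cal.add(Calendar.DAY_OF_MONTH,-1) // add() also rolls back across month and year boundaries
  fmt.format(cal.getTime)
}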
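On Java 8 or later the same result is available from java.time without Calendar. This is a different technique from the one in the listing, shown as an alternative sketch; the object name PreviousDay is made up for this example:

```scala
import java.time.LocalDate

object PreviousDay {
  // LocalDate.parse reads ISO-8601 dates (yyyy-MM-dd) by default,
  // and toString writes the same format back out.
  def getPreviousDay(mydate: String): String =
    LocalDate.parse(mydate).minusDays(1).toString
}
```

LocalDate is immutable and carries no time zone, so this version avoids both the thread-safety caveat of SimpleDateFormat and any dependence on the default zone.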
krishnanraman / Splitter.scala
Created May 19, 2016 20:54
PS: Use the partitionBy(col*) API in DataFrameWriter, NOT the code below (which works but is so... 2015).
import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD
object Splitter {
def split[T:ClassTag, U:ClassTag](rdd:RDD[T], f:T=>U) = {
val splits = rdd.groupBy{ x:T => f(x) }
val keys:Seq[U] = splits.keys.collect().toSeq
keys.foreach{ key:U =>
splits
.filter{ x => x._1 == key } // get records that match this key
$ ls -l
0
1
2
3
4
5
6
7
8
Desired Sum: 1 Possible ? false Subset:List()
Desired Sum: 2 Possible ? true Subset:List(List(2))
Desired Sum: 3 Possible ? true Subset:List(List(2, 3))
Desired Sum: 4 Possible ? false Subset:List()
Desired Sum: 5 Possible ? true Subset:List(List(2, 3, 5), List(2, 5))
Desired Sum: 6 Possible ? false Subset:List()
Desired Sum: 7 Possible ? true Subset:List(List(2, 3, 5, 7), List(2, 5, 7), List(2, 7))
Desired Sum: 8 Possible ? true Subset:List(List(2, 3, 5, 7, 8))
Desired Sum: 9 Possible ? true Subset:List(List(2, 3, 5, 7, 8), List(2, 5, 7), List(2, 7))
Desired Sum: 10 Possible ? true Subset:List(List(2, 3, 5, 7, 8))
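The listing above reads like the output of a subset-sum check. The input set is not shown; {2, 3, 5, 7, 8} is an assumption that agrees with the true/false column. A minimal recursive sketch under that assumption follows (the names SubsetSum, subsets and possible are made up here). It does not try to reproduce the Subset column, whose exact meaning cannot be recovered from the output alone:

```scala
object SubsetSum {
  // All subsets of xs (elements kept in input order) that add up to target.
  // ASSUMPTION: xs contains positive integers only.
  def subsets(xs: List[Int], target: Int): List[List[Int]] = xs match {
    case Nil => if (target == 0) List(Nil) else Nil
    case h :: t =>
      // Either take h (if it still fits) and solve for the rest, or skip h.
      val withH = if (h <= target) subsets(t, target - h).map(h :: _) else Nil
      withH ++ subsets(t, target)
  }

  def possible(xs: List[Int], target: Int): Boolean = subsets(xs, target).nonEmpty
}
```

For instance, SubsetSum.subsets(List(2, 3, 5, 7, 8), 10) finds the three subsets 2+3+5, 2+8 and 3+7, while targets 4 and 6 have none. The recursion is exponential in the size of the set, which is fine for a handful of elements.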