Skip to content

Instantly share code, notes, and snippets.

@dimatkach
Last active October 2, 2017 13:23
Show Gist options
  • Save dimatkach/4a619664b894d5cd7a36aa025f530996 to your computer and use it in GitHub Desktop.
Save dimatkach/4a619664b894d5cd7a36aa025f530996 to your computer and use it in GitHub Desktop.
groupBy vs. floldLeft vs. for...
import scala.collection.mutable
/**
* Demonstrates performance difference between different implementations of counting duplicates in an array.
* The code below "as is" prints
* ```
* groupBy 25432
* fold 7591
* mutable 16233
* raw 16349
* ```
* demonstrating that the "naive" `.groupBy` implementation is the fastest.
* The relative difference in performance remains approximately the same as the input size grows.
* Increasing the number of unique elements (e.g, change 25 to 250), keeps the `.fold` numbers unchanged, at about 3 times
* lower than groupBy, while the other two show slight improvement.
* For example, for the 100000 inputs with 250 unique keys, the numbers are:
* ```
* groupBy 1534
* fold 591
* mutable 1273
* raw 1303
* ```
* This is, probably, explained by the second scan over the result, required by the first solution, but not by the others.
*/
object Benchmark {
/** Runs `f` repetedly for `secs` seconds,
* returns number of times it was run
*/
def bench(secs: Long)(f: => Unit) = {
var n = 0
val now = System.currentTimeMillis
val then = now + secs * 1000
while(System.currentTimeMillis < then) { f ; n += 1 }
n
}
def withGB[T](l: Seq[T]) = l.groupBy(identity).mapValues(_.size)
def withFold[T](l: Seq[T]) = l.foldLeft(Map.empty[T, Int].withDefault(_ => 0)) { case (m, e) => m.updated(e, m(e) + 1) }
def withMut[T](l: Seq[T]) = l.foldLeft(mutable.Map.empty[T, Int].withDefault(_ => 0)) { case (m, e) => m(e) += 1; m }
def withRaw[T](l :Seq[T]) = {
val m = mutable.Map.empty[T, Int].withDefault(_ => 0)
for(e <- l) m(e) += 1
m
}
def main(argv: Array[String]) {
val input = (1 to 10000).map(_ => scala.util.Random.nextInt(25).toString)
Seq(
"groupBy" -> withGB[String] _,
"fold" -> withFold[String] _,
"mutable" -> withMut[String] _,
"raw" -> withRaw[String] _
).foreach { case (title, fu) =>
print(title + " ")
println(bench(10) { fu(input) })
}
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment