Last active
October 2, 2017 13:23
-
-
Save dimatkach/4a619664b894d5cd7a36aa025f530996 to your computer and use it in GitHub Desktop.
groupBy vs. floldLeft vs. for...
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import scala.collection.mutable | |
/** | |
* Demonstrates performance difference between different implementations of counting duplicates in an array. | |
* The code below "as is" prints | |
* ``` | |
* groupBy 25432 | |
* fold 7591 | |
* mutable 16233 | |
* raw 16349 | |
* ``` | |
* demonstrating that the "naive" `.groupBy` implementation is the fastest. | |
* The relative difference in performance remains approximately the same as the input size grows. | |
* Increasing the number of unique elements (e.g, change 25 to 250), keeps the `.fold` numbers unchanged, at about 3 times | |
* lower than groupBy, while the other two show slight improvement. | |
* For example, for the 100000 inputs with 250 unique keys, the numbers are: | |
* ``` | |
* groupBy 1534 | |
* fold 591 | |
* mutable 1273 | |
* raw 1303 | |
* ``` | |
* This is, probably, explained by the second scan over the result, required by the first solution, but not by the others. | |
*/ | |
object Benchmark { | |
/** Runs `f` repetedly for `secs` seconds, | |
* returns number of times it was run | |
*/ | |
def bench(secs: Long)(f: => Unit) = { | |
var n = 0 | |
val now = System.currentTimeMillis | |
val then = now + secs * 1000 | |
while(System.currentTimeMillis < then) { f ; n += 1 } | |
n | |
} | |
def withGB[T](l: Seq[T]) = l.groupBy(identity).mapValues(_.size) | |
def withFold[T](l: Seq[T]) = l.foldLeft(Map.empty[T, Int].withDefault(_ => 0)) { case (m, e) => m.updated(e, m(e) + 1) } | |
def withMut[T](l: Seq[T]) = l.foldLeft(mutable.Map.empty[T, Int].withDefault(_ => 0)) { case (m, e) => m(e) += 1; m } | |
def withRaw[T](l :Seq[T]) = { | |
val m = mutable.Map.empty[T, Int].withDefault(_ => 0) | |
for(e <- l) m(e) += 1 | |
m | |
} | |
def main(argv: Array[String]) { | |
val input = (1 to 10000).map(_ => scala.util.Random.nextInt(25).toString) | |
Seq( | |
"groupBy" -> withGB[String] _, | |
"fold" -> withFold[String] _, | |
"mutable" -> withMut[String] _, | |
"raw" -> withRaw[String] _ | |
).foreach { case (title, fu) => | |
print(title + " ") | |
println(bench(10) { fu(input) }) | |
} | |
} | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment