Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?

I was curious about the results reported here, which reports that Scala's mutable maps are slower than Java's: http://www.infoq.com/news/2011/11/yammer-scala

In my tests, Scala's OpenHashMap equals or beats java's HashMap:

Insertion 100k elements (String keys) time in ms:

  • scala HashMap: 92.75
  • scala OpenHashMap: 14.03125
  • java HashMap: 15.78125

Insertion 100k elements (Integer keys) time in ms:

  • scala HashMap: 98.625
  • scala OpenHashMap: 13.71875
  • java HashMap: 23.5625

Hastily thrown together timing code below. I at least run enough trials to get a half second sample.

Scala's mutable HashMap implementation was known to be slower than HashMap, which motivated David MacIver to write OpenHashMap way back in 2008. Unfortunately, not many people know it.

package org.apache.spark

object Test {
  def time(warmup: Int)(block: => Any) = {
    for (i <- 1 to warmup) block // possibly trigger JIT on freq used methods in block
    var goodSample = false
    var start = 0L; var stop = 0L;
    var N = 10
    while (!goodSample) {
      start = System.currentTimeMillis
      for (i <- 1 to N) block
      stop = System.currentTimeMillis
      if (stop - start < 500) N = N*4 // require at least half a second for a decent sample
      else goodSample = true
    }
    val perOp = (stop.toDouble - start.toDouble) / N
    perOp
  }

  def main(args: Array[String]) {
    val vs = (0 to 100000).map(_ toString).toArray
    val warmup = 10
//
//    {
//      println(time(warmup) {
//        val m = new scala.collection.mutable.HashMap[String, Int];
//        var i = 0;
//        while (i < 100000) {
//          m.put(vs(i), i);
//          i = i + 1;
//        };
//      })
//
//      println(time(warmup) {
//        val m = new scala.collection.mutable.OpenHashMap[String, Int];
//        var i = 0;
//        var start = System.currentTimeMillis();
//        while (i < 100000) {
//          m.put(vs(i), i);
//          i = i + 1;
//        };
//      })
//
//      println(time(warmup) {
//        val m = new java.util.HashMap[String, Int];
//        var i = 0;
//        while (i < 100000) {
//          m.put(vs(i), i);
//          i = i + 1;
//        };
//      })
//    }

    // With Integer keys
    {
      println("scala HashMap: " + time(warmup) {
        val m = new scala.collection.mutable.HashMap[Int, Int];
        var i = 0;
        while (i < 100000) {
          m.put(i, i);
          i = i + 1;
        };
      })

      println("scala OpenHashMap: " + time(warmup) {
        val m = new scala.collection.mutable.OpenHashMap[Int, Int];
        var i = 0;
        var start = System.currentTimeMillis();
        while (i < 100000) {
          m.put(i, i);
          i = i + 1;
        };
      })
      println("Spark OpenHashMap: " + time(warmup) {
        val m = new org.apache.spark.util.collection.OpenHashMap[Int, Int]
        var i = 0
        while (i < 100000) {
          m(i) = i
          i = i + 1
        }
      })
      println("java HashMap: " + time(warmup) {
        val m = new java.util.HashMap[Int, Int];
        var i = 0;
        while (i < 100000) {
          m.put(i, i);
          i = i + 1;
        };
      })
    }
  }
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment