Skip to content

Instantly share code, notes, and snippets.

@shengc
Last active May 25, 2016 02:51
Show Gist options
  • Save shengc/c1362729fe25778a6ae0 to your computer and use it in GitHub Desktop.
Save shengc/c1362729fe25778a6ae0 to your computer and use it in GitHub Desktop.

Idiomatic way of implementing zipWithIndex on List[A] in FP is using a State Monad,

  import scalaz.State

  def zipWithIndex[A](list: List[A]): List[(A, Int)] = 
    list.foldLeft(State.state[Int, List[(A, Int)]](List.empty))({ (state, a) => 
      for {
        ls <- state
        i <- State.get[Int]
        _ <- State.put[Int](i + 1)
      } yield (ls :+ (a, i))
    }).eval(0)

There are two problems for this impl. First, it is appending item to the List during fold, which brings over performance penalty. The second one is more severe. Can you find out what that is? Let's try this implementation through console anyway,

  scala> zipWithIndex(List.fill(10)('a'))
  res1: List[(Char, Int)] = List((a,0), (a,1), (a,2), (a,3), (a,4), (a,5), (a,6), (a,7), (a,8), (a,9))

  scala> zipWithIndex(List.fill(100)('a')).size
  res2: Int = 100

so far so good. Now let's try a list with really large size,

  scala> zipWithIndex(List.fill(100000)('a')).size
  java.lang.StackOverflowError
    at scalaz.IndexedStateT$$anonfun$flatMap$1.apply(StateT.scala:62)
    at scalaz.IndexedStateT$$anon$10.apply(StateT.scala:95)
    at scalaz.IndexedStateT$$anonfun$flatMap$1.apply(StateT.scala:62)
    at scalaz.IndexedStateT$$anon$10.apply(StateT.scala:95)
    at scalaz.IndexedStateT$$anonfun$flatMap$1.apply(StateT.scala:62)
    at scalaz.IndexedStateT$$anon$10.apply(StateT.scala:95)
    ...

Ah - the nested flatMap call on State monad blows the stack space, in other words, Fold on State is not stack safe! Now that we know what the problem is, is there a way we can get away with it? In FP, whenever recursion raises issue of SOE, one should consider refactoring the code by trampolining it. Can we count on Trampoline for the rescue in this case? Yes, we can!

First of all, we need a Monad Transformer of State on Trampoline. Let's implement that,

  import scalaz.{ StateT, Free, Trampoline }

  type TrampolineState[S, A] = StateT[Free.Trampoline, S, A]
  object TrampolineState {
    def apply[S, A](f: S => (S, A)) = StateT[Free.Trampoline, S, A] { s => Trampoline.done(f(s)) }
    def get[S]: TrampolineState[S, S] = TrampolineState { s => (s, s)  }
    def put[S](s: S): TrampolineState[S, Unit] = TrampolineState { _ => (s, ())  }
  }

Now, let's re-impement zipWithIndex using TrampolineState.

  import scalaz.std.list._
  import scalaz.syntax.foldable._

  type M[A] = TrampolineState[Int, A]

  def zipWithIndex2[A](list: List[A]): List[(A, Int)] = 
    list.foldLeftM[M, List[(A, Int)]](List.empty)({ (ls, a) =>
      for {
        i <- TrampolineState.get[Int]
        _ <- TrampolineState.put[Int](i + 1)
      } yield (ls :+ (a, i))
    }).eval(0).run

Do not worry about what foldLeftM is. It is just a generic version of foldLeft to fold on top of type M as long as M itself is a monad. Ok, let's give our second draft a try,

  scala> zipWithIndex2(List.fill(10)('a'))
  res13: List[(Char, Int)] = List((a,0), (a,1), (a,2), (a,3), (a,4), (a,5), (a,6), (a,7), (a,8), (a,9))

  scala> zipWithIndex2(List.fill(100)('a')).size
  res14: Int = 100

so far so good - let's now try the one that blows the stack space for the 1st draft of zipWithIndex,

  scala> zipWithIndex2(List.fill(100000)('a')).size
  res16: Int = 100000

Yay, so the new implementation is indeed stack safe now ! Note that it does take more than usual to have the result returned, since, remember, we are doing list appending all the time in the fold operation.

But this implementation is still a little unsatisfying in that the type alias of M seems to be unnecessary. Let's try if we can get rid of it,

  scala> :paste
  // Entering paste mode (ctrl-D to finish)

  def zipWithIndex2[A](list: List[A]): List[(A, Int)] =
    list.foldLeftM[({type M[x] = TrampolineState[Int, x]})#M, List[(A, Int)]](List.empty)({ (ls, a) =>
      for {
        i <- TrampolineState.get[Int]
        _ <- TrampolineState.put[Int](i + 1)
      } yield (ls :+ (a, i))
    }).eval(0).run

  // Exiting paste mode, now interpreting.

  <console>:20: error: could not find implicit value for parameter M: scalaz.Monad[[x]scalaz.IndexedStateT[[A]scalaz.Free[Function0,A],Int,Int,x]]
         list.foldLeftM[({type M[x] = TrampolineState[Int, x]})#M, List[(A, Int)]](List.empty)({ (ls, a) =>

Urgh the type projection does not work, as we are hitting the limitation of Scala compiler here :-( (Miles Sabin recently opened a pull request (https://github.com/milessabin/si2712fix-plugin) to address SI-2712. If accepted, foldLeftM should be able to be called without specifying types explicitly. Daniel Spiewak has a great gist to introduce it: https://gist.github.com/djspiewak/7a81a395c461fd3a09a6941d4cd040f2)

Is there still a way to reduce the boilerplate at all? Actually, yes - thanks to some very genius work done in scalaz.

  import scalaz.std.list._
  import scalaz.syntax.traverse._

  def zipWithIndex3[A](list: List[A]): List[(A, Int)] = 
    list.traverseU { a =>
      for {
        i <- TrampolineState.get[Int]
        _ <- TrampolineState.put[Int](i + 1)
      } yield (a, i)
    } eval 0 run

traverseU makes use of an implicit Unapply instance to help compiler figure out the right type to traverse the List with. Check out Travis Brown's blog if you are interested in more details about that, https://meta.plasm.us/posts/2015/07/11/roll-your-own-scala/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment