Skip to content

Instantly share code, notes, and snippets.

@debasishg

debasishg/fold.md

Created Mar 21, 2020
Embed
What would you like to do?
some links on foldl/foldr

Original : https://github.com/hasura/graphql-engine/pull/2933#discussion_r328821960

Yes, that’s right. It never makes sense to use foldl on lists because it never has any benefit and will always leak space.

To explain why, I wrote a mini blog post explaining the difference between foldl and foldr in Haskell. To start, you have to understand that foldl and foldr are not folds “from the left” and “from the right.” Both foldl and foldr traverse the structure in the same order, which in the case of lists means left to right. The difference is the fold’s associativity.

foldl vs foldr illustrated The best way to think about this is with an illustration. When you write

foldl (⨂) v [e0, e1, e2, ..., en−1, en] you’re performing the following computation:

( ... (((v ⨂ e0) ⨂ e1) ⨂ e2) ⨂ ... ⨂ en−1) ⨂ en In contrast, when you write

foldr (⨂) v [e0, e1, e2, ..., en−1, en] you’re performing this computation:

e0 ⨂ (e1 ⨂ (e2 ⨂ ... ⨂ (en−1 ⨂ (en ⨂ v)) ... )) See the difference? In both expressions, the elements of the list appear in the expression in the same order—from left to right—but the grouping changes. With foldl, the applications of (⨂) are left-associated, while with foldr, they’re right-associated.

foldl vs. foldr, strictly The question is: how does this difference actually impact the behavior of a program? Well, let’s start by first thinking about what the difference would be in a strict language. In a strict language, evaluation order always proceeds from the “inside out,” starting with the most deeply nested expression.

Let’s think about that in the context of foldl first. Let’s say we wrote this expression:

foldl (+) 0 [1, 2, 3, 4] By the above illustration, we know that expression is equivalent to this one:

(((0 + 1) + 2) + 3) + 4 Reducing from the inside out, we get the following reduction sequence:

  foldl (+) 0 [1, 2, 3, 4]
= (((0 + 1) + 2) + 3) + 4
= (( 1      + 2) + 3) + 4
= (  3           + 3) + 4
=    6                + 4
=   10

In contrast, if we had used foldr, we’d get the same result (since (+) is an associative, commutative operation), but with a slightly different reduction sequence:

  foldr (+) 0 [1, 2, 3, 4]
= 1 + (2 + (3 + (4 + 0)))
= 1 + (2 + (3 +      4 ))
= 1 + (2 +           7  )
= 1 +                9
= 10

What’s the practical difference between these two things? Well, note the following detail: with foldl, to start reducing, we only need the first element of the list, but with foldr, we have to start from the end of the list and reduce “backwards.”1 Practically, this means foldl can be tail-recursive, reducing as it traverses the list in constant space, while foldr cannot be: to reduce a list of length n with foldr, you need to create n stack frames before any reduction can start.

1 This is why foldr is sometimes described as “folding from the right”, even though it traverses the list from left to right. As we’ll see, however, this doesn’t actually hold in lazy languages!

foldl, lazily But what about in lazy languages, like Haskell? In a lazy language, evaluation order doesn’t proceed from the “inside out” like it does in strict languages, but rather from the “outside in,” evaluating expressions only as their results are demanded.

In a strict language, foldl (+) 0 [1, 2, 3, 4] doesn’t actually get turned into the expression (((0 + 1) + 2) + 3) + 4; as I mentioned earlier, it’s implemented as a tail-recursive loop. But in Haskell, it basically does get expanded into that expression before any reduction starts—each application of (+) is lazily suspended in a thunk.

If we explicitly denote thunks with ⟨⟩ brackets, the thunk we’ll end up with looks like this:

⟨⟨⟨⟨0 + 1⟩ + 2⟩ + 3⟩ + 4⟩ These thunks will only get forced when the outermost thunk is evaluated, which will cause (+) to be applied to the arguments ⟨⟨⟨0 + 1⟩ + 2⟩ + 3⟩ and 4. Since (+) is strict in both its arguments, it will force the next thunk, which will in turn apply (+) to ⟨⟨0 + 1⟩ + 2⟩ and 3, and so on until the whole thunk tree is reduced.

The result ends up being the same, but from a practical point of view, this is really bad, because instead of reducing the list in constant space, like we did in the strict language, we’re now creating a thunk linear in the size of the input list! It’s even worse that that, though, because in a strict language, the input list takes up space linear to its own size, so our overall space consumption for producing/consuming the list would simply be a constant factor increase, but in a lazy language, the list itself is more like a stream, which may actually use constant space if the whole stream is not fully realized in memory. By using the lazy foldl, we’ve possibly gone from a constant-space algorithm to a linear-space algorithm, which is bad!

What we want is to recover the behavior of the strict language’s foldl, efficiently updating an accumulator as we traverse the list, not building thunks. Therefore, we need a stricter version of foldl, which is exactly what foldl' is. foldl' places a demand on the ⟨0 + 1⟩ thunk before moving onto the next element of the list, so instead of building a larger ⟨⟨0 + 1⟩ + 2⟩ thunk, it simply builds a ⟨1 + 2⟩ thunk. foldl' continues traversing the list in constant space, never building a thunk larger than a single application of (+).

foldr, lazily But what about foldr? Remember that in a strict language, foldr already needed to consume space linear in the size of the input list, since it fundamentally needed the last element in the list before it could start reducing. Indeed, if we consider a lazy foldr with a strict operation like (+), this is still true—but in an interestingly different way from foldl.

With foldl, we ended up building thunks incrementally as we traversed the list, leading to a very large, nested thunk. But with foldr, that doesn’t actually happen. Why? Well, consider the expansion again for just a moment:

  foldr (+) 0 [1, 2, 3, 4]
= 1 + (2 + (3 + (4 + 0)))

To consider how we end up with this expansion, let’s write the expansion out in an explicitly inductive way:

  foldr (+) 0 [1, 2, 3, 4]
= 1 + foldr (+) 0 [2, 3, 4]
= 1 + (2 + foldr (+) 0 [3, 4])
= 1 + (2 + (3 + foldr (+) 0 [4]))
= 1 + (2 + (3 + (4 + foldr (+) 0 [])))
= 1 + (2 + (3 + (4 + 0)))

This makes the recursive calls to foldr more explicit. In a strict language, as soon as we call foldr, we need to demand the result, so we have to traverse the whole list. But here’s where things get interesting—in a lazy language, we can actually just return the following result, with a suspended thunk:

  foldr (+) 0 [1, 2, 3, 4]
= 1 + ⟨foldr (+) 0 [2, 3, 4]⟩

This might seem totally irrelevant, since once that result is forced, the ⟨foldr (+) 0 [2, 3, 4]⟩ will be forced by (+), and we’ll get the same reduction sequence we had before. But note that this is only true because (+) is a strict operation. What if we instead used a lazy operation, like (:)? In that case, we’d get the following expansion:

  foldr (:) [] [1, 2, 3, 4]
= 1 : ⟨foldr (:) [] [2, 3, 4]⟩

Guess what? That result is already in weak-head normal form (WHNF)! So evaluation just stops there until the rest of the result is explicitly demanded by something else. Now, in this case, this is a silly operation, since foldr (:) [] is just a complicated identity function on lists, but we could imagine a slightly more complicated function, such as one that doubles each element in a list:

let f x xs = (x * 2) : xs
in foldr f [] [1, 2, 3, 4]

This will expand into the following:

  foldr f [] [1, 2, 3, 4]
= ⟨1 * 2⟩ : ⟨foldr f [] [2, 3, 4]⟩

…and again, it will just stop there, since it’s already in WHNF. How is this useful? Well, what if we didn’t actually consume the entire result list, like this?

sum (take 2 (foldr f [] [1, 2, 3, 4])) Because take 2 will only return the first two elements of the list, then when sum forces the list and its values to add them together, it will never even evaluate the thunk ⟨foldr f [] [3, 4]⟩, and the list will only be partially-traversed.

What are the implications of this? Well, it means that foldr can possibly save on work if the reducing function is lazy in its second argument, and the result list is not entirely consumed. In fact, foldr can operate on infinite lists this way, while foldl cannot. It also means that foldr may be subject to more list fusion than foldl, though that’s another discussion entirely.

foldl vs. foldr, lazily Okay, so, to briefly recap, here’s what I’ve said so far:

In a lazy language, foldl on lists is bad because it’s too lazy, and it builds up big thunks. Use foldl' instead to force the thunks incrementally and consume the list in constant space.

In a lazy language, foldr on lists is good because it’s lazy, so if the reducing function is lazy in its second argument, it can save on work.

These two things might seem a little contradictory. Why is foldl bad because it’s too lazy while foldr is good because it’s lazy?

To understand the difference, let’s expand foldl inductively like we did with foldr:

  foldl (+) 0 [1, 2, 3, 4]
= foldl (+) (0 + 1) [2, 3, 4]
= foldl (+) ((0 + 1) + 2) [3, 4]
= foldl (+) (((0 + 1) + 2) + 3) [4]
= foldl (+) ((((0 + 1) + 2) + 3) + 4) []
= ((((0 + 1) + 2) + 3) + 4)

See the difference? With foldr, the recursive call was pushed into a “leaf” of the resulting expression tree, but with foldl, the recursive call is always the root. This is, by the way, why foldl is tail recursive—this is exactly what tail recursion is!—but it means it can’t possibly be lazy, since it will never be in WHNF until the entire list has been traversed.

This gives us a general rule of thumb for using foldl and foldr on lists:

When the accumulation function is strict, use foldl' to consume the list in constant space, since the whole list is going to have to be traversed, anyway.

When the accumulation function is lazy in its second argument, use foldr to do work incrementally to improve streaming and work-saving.

Never use foldl or foldr'; they’re always worse on lists.

In your case, the accumulation function you’re applying is Map.delete, which is strict, so you should use foldl'.

That said, this is often a micro-optimization, so if the list is not large, it usually doesn’t really matter. It’s just a good habit to get into, and it’s worth understanding, since it’s a great example of laziness in practice.

Addendum: foldl and foldr on other data structures As a final note, you might wonder: if foldl and foldr' are so useless, why do they even exist? Why not just have foldl' and foldr?

The answer is that everything I just said only applies to lists. This behavior happens because, fundamentally, (:) is a right-associative operation, so the “remainder” of the list is on the right. But if we had snoc lists, like this:

data SnocList a = Nil | Snoc (SnocList a) a …then our lists would be left-associative, and we’d want to use foldr' in situations where we use foldl' on ordinary lists and foldl where we use foldr on ordinary lists. A little confusing, isn’t it?

Ordinary cons lists and snoc lists are basically the two extremes of foldl vs foldr, but in practice, other data structures are a lot fuzzier. For example, if you have a tree, like

data Tree a = Leaf | Branch (Tree a) a (Tree a) …then some elements are on the left and others are on the right, and neither foldl nor foldr are clearly better. In that case, if you really, really care about performance, foldMap and foldMap' are usually your best bet, since they don’t specify any particular associativity of calls to (<>). However, we don’t have foldMap' until base-4.13.0.0, which won’t be available until we switch to GHC 8.8.1. (But in truth, it probably doesn’t matter, anyway.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment