Skip to content

Instantly share code, notes, and snippets.

@stuhlmueller
Last active February 20, 2021 03:26
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save stuhlmueller/2f639015a0703eea832c630f03e0f226 to your computer and use it in GitHub Desktop.
Save stuhlmueller/2f639015a0703eea832c630f03e0f226 to your computer and use it in GitHub Desktop.

Elicit tasks as fold and unfold

Many of the tasks we care about either expand a tree by one layer (add factors, subquestions, datasets) or summarize a tree layer into one or a few nodes (summarize, cluster, rank).

This raises two questions:

  1. Is there an existing way to characterize these tasks so that we can make an app that supports all and only the ones we want?
  2. Language models have a finite context window. How can we clearly think about tasks that operate on arbitrarily large inputs?

Technically, we can think of expansion tasks as unfolds (creating structure) and summarization-like tasks as folds (reducing structure). They're defined like this:

def fold(f, seed, lst):
    """
    Reduces a list to a value (which could be another list)
    by iteratively applying a function to each element of the 
    list and the seed value.

    f: Maps list value and seed to new seed
    seed: The state for the fold
    lst: The list to be reduced
    """
    if not lst:
        return seed
    else:
        return fold(f, f(lst[0], seed), lst[1:])


def unfold(p, f, g, seed):
    """
    Creates a list from a seed value by iteratively 
    applying a function to the seed.

    p: Determines when to stop unfolding.
    f: Maps each seed value to the corresponding list element.
    g: Maps each seed value to next seed value.
    seed: The state for the unfold.
    """
    if p(seed):
        return []
    else:
        return [f(seed)] + unfold(p, f, g, g(seed))

These functions allow us to specify a task that operates on an arbitrarily large data structure using only operations on one or two elements. Here's how to use unfold to make an arbitrarily large list of numbers:

def range(n):
    return unfold(lambda x: x == n,
                  lambda x: x,
                  lambda x: x + 1,
                  0)

None of the functions passed as arguments to unfold need to know about more than one element.

And here's how to sum a list of numbers:

def sum(xs):
    return fold(lambda i, j: i + j,
                0,
                xs)

The function passed as argument to fold only needs to know about pairs of numbers.

Language models have a finite context window and so if we want to use them to transform large inputs we need a reduction like this.

We can frame most of the workflows we care about as combinations of folds and unfolds. For example, consider this sequence:

  1. Generate relevant subquestions (unfold)
  2. Find a dataset for each subquestion (fold)
  3. Look up relevant data in each dataset (fold + unfold)
  4. Answer each subquestion given sources (fold + fold)
  5. Summarize subquestion answers into overall answer (fold)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment