Skip to content

Instantly share code, notes, and snippets.

@Gurkenglas
Created July 28, 2015 18:52
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save Gurkenglas/252f3ce87857aa3ff428 to your computer and use it in GitHub Desktop.
Save Gurkenglas/252f3ce87857aa3ff428 to your computer and use it in GitHub Desktop.
Today I shall attack the problem of implementing subsequences :: [a] -> [[a]].
I'll be recounting my thought process like I'd imagine it to be if I was a little closer in smartness to where I wish I was.
Someone on #haskell just posted some way to implement it using lets and wheres and callbacks and one-character names and more, and that problem doesn't sound so hard, so I'll try implementing it myself.
Okay, how? Each element of the sequence needs to be in half the subsequences, and not in the other half. This reminds me of the monad of nondeterminism (also called [] monad) and using it with filterM:
<Gurkenglas> :t filterM (const [True, False])
<lambdabot> [b] -> [[b]]
There. filterM makes each b in the [b] split the world in half - in one half, the filter function returns True, in the other False - and in the end all the possible resulting filtered lists - the subsequences - accumulate in the [[b]].
I post it to #haskell. Someone says that wouldn't work with infinite lists.
Hmm, true. Not even filter (const False) [1..] terminates, so the monadic version also would never output even one subsequence.
How do we deal with infinite lists? Our list of subsequences will also be infinite, so we'll have to take care that all subsequences are enumerated after some finite time. (Actually, the set of subsequences of a list is uncountably large, so we'll have to restrict ourselves to the finite subsequences of the infinite list.) One way to assure that is by putting all the subsequences that are contained in a prefix of our infinite list to the left of all subsequences not contained in that prefix.
When I make the prefix larger by one, what subsequences are newly contained in it? Each of them will contain the element newly contained in the prefix. Other than that, they'll look just like the subsequences contained in the previous prefix. Aha! This immediately points to an implementation:
foldl (flip $ ap (++) . map . (:)) [[]]
We start with the list containing only the empty subsequence. From left to right, we wander through the infinite list, adding each new element to all subsequences we already found and concatenating the old subsequence list with the new one. (At this point, I've encountered ap often enough that I've started using it on my own where it's convenient. ap (,) is also often useful!)
Adding the new element to the start of each subsequence will make each subsequence be the wrong way round at the end, but having to reverse each subsequence at the end is better than adding each new element to the far end of each subsequence.
<Gurkenglas> > map reverse . foldl (flip $ ap (++) . map . (:)) [[]] $ "asd"
<lambdabot> ["","a","s","as","d","ad","sd","asd"]
<Gurkenglas> > map reverse . foldl (flip $ ap (++) . map . (:)) [[]] $ [1..]
<lambdabot> mueval-core: Time limit exceeded
Huh? Oh. The infinite number of ++ used in (((...((([...] ++ [...]) ++ [...]) ++ [...])...) ++ [...]) ++ [...]) won't do anything to the head of the list, but the compiler doesn't know that, and keeps trying to unpack it from the infinitely nested brackets.
We'll have to arrange the lists of new subsequences such that accessing the first ones doesn't take infinitely long. One arrangement would be a list of the lists of new subsequences. How do we construct it?
In the general case, how do we get a function ([a] -> a) -> [a], that will construct its output list by, starting with [], applying the argument function to each intermediate list to get the next element? After some tinkering:
<Gurkenglas> @type \f -> fix (map f . zipWith take [0..] . repeat)
<lambdabot> ([a] -> a) -> [a]
We take our intermediate list, clone it infinitely often, get its prefixes, apply f to each, and feed the result back into the same process. For those of you who don't know fix, this is in fact only done once: The list is used to compute itself, and any thunks that refer to the list itself are resolved after the needed part of the list is done. Compare to the famous fibs = 0 : 1 : zipWith (+) fibs (tail fibs).
After much desparation over the ugly zipWith take [0..] . repeat, someone on #haskell introduces me to inits, which does exactly what I wanted.
> let foo f = fix $ map f . inits
(I would have written it foo = fix . (. inits) . map for the pointfrees, but that line came from another in the channel.)
<Gurkenglas> > foo ((+1) . product)
<lambdabot> [2,3,7,43,1807,3263443,10650056950807,113423713055421844361000443,1286493868...
Now armed with this knowledge, let's apply it to the problem! *sits down with pen and paper*
http://www11.pic-upload.de/28.07.15/x7hsyz4gkhcq.png
map concat . inits nearly gives us what we need to construct the list from itself, but the step from [] to [""] doesn't look like the others. We'll drop the head, zip that with our infinite list, and prepend the list containing only the empty subsequence.
<Gurkenglas> @let subseqs = map reverse . concat . fix . (([[]] :) .) . (. map concat . tail . inits) . zipWith (map . (:))
From right to left: take in a list and prepare to zip it with a list of lists using map . (:), which just prepends an element from the first lists to all elements of an element of the second list. Take that prepared zipping, and use it to fill in the blank in this composition:
([[]] :) . _ . map concat . tail . inits
From right to left: Take a list, turn it into a list of its prefixes, drop that [] in the front, concat each list of lists of new subsequences into a list of old subsequences, zip it using the prepared zipper, and prepend the list containing only the empty subsequence.
Continuing where we left off in subseqs: Take the now completed composition that transforms a list of lists of subsequences into a list of list of subsequences, and use fix to feed that transformer into itself, getting the complete list of lists of new subsequences, concat it to get the complete list of subsequences, don't forget to undo our reversal of each subsequence, done.
<Gurkenglas> > subseqs [1..]
<lambdabot> [[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3],[4],[1,4],[2,4],[1,2,4],[3,4],[1,3...
Lovely! Let's follow the suggestion from #haskell and write a blog post about this. It's even greater than this one since I'm pretending to be smarter than I am.
As I write the paragraph about uncountability, I think of a way to use Cofree to solve this in yet another nice way, and resolve to add it to the end of the blog post as an appendix, solving it in real time as I write the appendix.
APPENDIX - Using ~~Cofree~~ hylo
It's a shame that we can't include the infinite subsequences in our list. But what we could do is represent them as the infinite branches of a tree! The nodes in that tree would represent the finite subsequences. Each branching would be the decision of whether to include an element in the subsequence to be constructed - the whole thing would be eerily similar to the filterM solution.
Branchings with the result of not adding an element wouldn't modify their subsequence - we should only output those subsequences contained in nodes that are the result of a "yes" branching.
~~ekmett's free package ( https://hackage.haskell.org/package/free ) contains Control.Comonad.Cofree, which conveniently does most of the work for you when you want to make trees.~~
ekmett's recursion-schemes package ( https://hackage.haskell.org/package/recursion-schemes ) contains Data.Functor.Foldable (hylo), which I've lately discovered to be applicable to a broad range of problems and been trying to get better with it since! I mean, just look at its type signature!
hylo :: Functor f => (f b -> b) -> (a -> f a) -> a -> b
Amazing.
(a -> f a) will be representing a branching. Given a position in the infinite sequence and an intermediate subsequence, we will be deciding whether or not to add the corresponding element in the sequence to our subsequence. Of course, we'll be doing both, but by the magic of nondeterminism we can pretend we chose one.
unfold :: (Subsequence, Sequence) -> f (Subsequence, Sequence)
Of course, we still have to decide what f is. Each branching results in two new branches, and we want to remember our new subsequence too, so the definition goes like this:
<Gurkenglas> @let data Branching s a = Branching s a a deriving (Functor, Show)
<lambdabot> Defined.
<Gurkenglas> > fmap (*2) $ Branching 4 5 6
<lambdabot> Branching 4 10 12
(I won't be using Show, but I needed it for those last two lines.)
unfold :: ([a], [a]) -> Branching [a] ([a], [a])
unfold (bus, x:xs) = Branching (reverse (x:bus)) (x:bus, xs) (bus, xs)
(f b -> b) will be representing us collecting the resulting subsequences from each branch and combining them such that our final list will mention all finite subsequences at some point.
fold :: f [Subsequence] -> [Subsequence]
Since we aren't modifying the subsequences during the collection, I'm not even gonna tell fold's type signature what it is dealing with.
fold :: Branching a [a] -> [a]
fold (Branching sub subsyes subsno) = (sub :) $ concat $ zipWith (\x y -> [x, y]) subsyes subsno
Drop the unchanged first subsequence from the no branch, zip-merge the lists from the two branches into one, prepend our own subsequence.
subseqs :: [a] -> [[a]]
subseqs = ([] :) . hylo L.fold unfold . (,) []
Decorate the (hopefully - we haven't handled the finite case) infinite input sequence with the empty subsequence, refold it with hylo, and prepend the empty subsequence because we only ever kept track of new ones, so the first one went forgotten.
<Gurkenglas> @undefine
<lambdabot> Undefined.
<Gurkenglas> @letlpaste 137499
<lambdabot> Defined.
<Gurkenglas> > subseqs [1..]
<lambdabot> [[1],[1,2],[2],[1,2,3],[2,3],[1,3],[3],[1,2,3,4],[2,3,4],[1,3,4],[3,4],[1,2,4],[2,4],[1,4],[4],[1,2,3,4,5],[2,3,4,5],[1,3,4,5],[3,4,5],[1,2,4,5],[2,4,5],[1,4,5],[4,5],[1,2,3,5],[2,3,5],[1,3,5],[3,5],[1,2,5],[2,5],[1,5],[5],[1,2,3,4,5,6],[2,3,4,5,6],[1,3,4,5,6],[3,4,5,6],[1,2,4,5,6],[2,4,5,6],[1,4,5,6],[4,5,6],[1,2,3,5,6],[2,3,5,6],[1,3,5,6],[3,5,
<lambdabot> 6],[1,2,5,6],[2,5,6],[1,5,6],[5,6],[1,2,3,4,6],[2,3,4,6],[1,3,4,6],[3,4,6],[1,2,4,6],[2,4,6],[1,4,6],[4,6],[1,2,3,6],[2,3,6],[1,3,6],[3,6],[1,2,6],[2,6],[1,6],[6],[1,2,3,4,5,6,7],[2,3,4,5,6,7],[1,3,4,5,6,7],[3,4,5,6,7],[1,2,4,5,6,7],[2,4,5,6,7],[1,4,5,6,7],[4,5,6,7],[1,2,3,5,6,7],[2,3,5,6,7],[1,3,5,6,7],[3,5,6,7],[1,2,5,6,7],[2,5,6,7],[1,5,6,7],[
<lambdabot> 5,6,7],[1,2,3,4,6,7],[2,3,4,6,7],[1,3,4,6,7],[3,4,6,7],[1,2,4,6,7],[2,4,6,7],[1,4,6,7],[4,6,7],[1,2,3,6,7],[2,3,6,7],[1,3,6,7],[3,6,7],[1,2,6,7],[2,6,7],[1,6,7],[6,7],[1,2,3,4,5,7],[2,3,4,5,7],[1,3,4,5,7],[3,4,5,7],[1,2,4,5,7],[2,4,5,7],[1,4,5,7],[4,5,7],[1,2,3,5,7],[2,3,5,7],[1,3,5,7],[3,5,7],[1,2,5,7],[2,5,7],[1,5,7],[5,7],[
Looks good.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment