Last active
March 16, 2020 09:11
-
-
Save treeowl/b7d5daee549eb9bc6e6f2d9f851aab8c to your computer and use it in GitHub Desktop.
Breadth-first binary tree creation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- | This module defines a function that produces a complete binary tree | |
-- from a breadth-first list of its (internal) node labels. It is an | |
-- optimized version of an implementation by Will Ness that avoids | |
-- any "impossible" cases. See | |
-- | |
-- https://stackoverflow.com/a/60561480/1477667 | |
module Bftr (Tree (..), bft, list, deftest) where | |
import Data.Function (fix) | |
import Data.Monoid (Endo (..)) | |
-- | A binary tree. | |
data Tree a | |
= Empty | |
| Node a (Tree a) (Tree a) | |
deriving Show | |
-- An infinite stream. Why don't we just use a list? First, I think it's | |
-- easier to see what's going on this way. Second, the garbage collector | |
-- can clean up more garbage with this representation. The `bft` function | |
-- includes several lazy patterns. These cause the runtime system to | |
-- produce thunks to actually perform the pattern matches. Unlike a list | |
-- cons, a stream cons is a *record*, so the runtime system will produce | |
-- *selector thunks* for these pattern matches. That means that as soon | |
-- as one field of a particular cons is forced, all the selector thunks | |
-- matching on *either* field become available for GC simplification. | |
data SS a = a :< SS a | |
infixr 5 :< | |
bft' :: [a] -> Tree a | |
bft' xs = tree | |
where | |
-- `subtrees` is a stream of all proper subtrees of the result tree, | |
-- in breadth-first order, followed by infinitely many empty trees. | |
-- We form each tree in the result by combining a label from the input | |
-- list with consecutive subtrees. | |
tree :< subtrees = go xs subtrees | |
go :: [a] -> SS (Tree a) -> SS (Tree a) | |
go (a : as) ~(b1 :< ~(b2 :< bs)) = Node a b1 b2 :< go as bs | |
go [] _ = fix (Empty :<) | |
-- Ugh. The above definition, while beautiful, can leak space, thanks | |
-- to limitations of GHC's selector thunk mechanism. To avoid this, | |
-- at the cost of efficiency in some cases, we write this instead. | |
-- | Build a complete binary tree from a list of its breadth-first | |
-- traversal. | |
bft :: [a] -> Tree a | |
bft xs = tree | |
where | |
-- `subtrees` is a stream of all proper subtrees of the result tree, | |
-- in breadth-first order, followed by infinitely many empty trees. | |
-- We form each tree in the result by combining a label from the input | |
-- list with consecutive subtrees. | |
tree :< subtrees = go xs subtrees | |
go :: [a] -> SS (Tree a) -> SS (Tree a) | |
go (a : as) ys = Node a b1 b2 :< go as bs | |
where | |
{-# NOINLINE b2bs #-} | |
b1 :< b2bs = ys | |
b2 :< bs = b2bs | |
go [] _ = fix (Empty :<) | |
-- | Perform a simple laziness test; takes a non-negative integer. | |
-- This should not throw an exception. | |
deftest :: Int -> Int | |
deftest n = sum . take n . list . bft $ [1..n] ++ undefined | |
-- | Convert a tree to a list in breadth-first order (useful for testing). | |
-- | |
-- @list . bft = id@ | |
-- | |
-- When @t@ is a complete tree, | |
-- | |
-- @bft . list $ t = t@ | |
list :: Tree a -> [a] | |
-- there's probably a better way to do this. | |
list t = appEndo (foldMap id (levels t)) [] | |
levels :: Tree a -> [Endo [a]] | |
levels Empty = [] | |
levels (Node a l r) = Endo (a:) : combine (levels l) (levels r) | |
where | |
combine [] ys = ys | |
combine (x : xs) ys = (x <> hd) : combine xs tl | |
where | |
(hd, tl) = case ys of | |
[] -> (mempty, []) | |
y:ys' -> (y, ys') |
I haven't followed the whole conversation here, but for breadth-first traversals I think the optimal function is the following:
levels :: Tree a -> [[a]]
levels t = takeWhile (not . null) (f t (repeat []))
where
f Empty qs = qs
f (Node x ls rs) ~(q : qs) = (x:q) : f ls (f rs qs)
It would probably benefit from some optimisation, like using streams instead of lists:
data Stream a = a :< Stream a
toListWhile p (x :< xs)
| p x = x : toListWhile p xs
| otherwise = []
levels :: Tree a -> [[a]]
levels t = toListWhile (not . null) (f t (fix ([] :<)))
where
f Empty qs = qs
f (Node x ls rs) ~(q :< qs) = (x:q) :< f ls (f rs qs)
It also looks the most like the "inverse" of the unfold.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
no, I replied to the comment directly above mine. I tried to summarize the workings of your
go
function there.is "rotate the queue" common terminology for the switching it describes? for my mind,
rotate (x:xs) == xs++[x]
.also, I'd name
b
's argumentbw
, for uniformity and clarity. I can already see it's[Tree a]
, so calling itts
doesn't add any new clues; but seeing that it's the samebw
as inf
clarifiesf
's definition for me. Or, we could get rid ofb
altogether withmy
bflist
can be made to skip over the Empty's ahead of time, at the cost of putting more clauses into its definition:Might perform better than the shorter version.