Skip to content

Instantly share code, notes, and snippets.

@treeowl
Last active March 16, 2020 09:11
Show Gist options
  • Save treeowl/b7d5daee549eb9bc6e6f2d9f851aab8c to your computer and use it in GitHub Desktop.
Save treeowl/b7d5daee549eb9bc6e6f2d9f851aab8c to your computer and use it in GitHub Desktop.
Breadth-first binary tree creation
-- | This module defines a function that produces a complete binary tree
-- from a breadth-first list of its (internal) node labels. It is an
-- optimized version of an implementation by Will Ness that avoids
-- any "impossible" cases. See
--
-- https://stackoverflow.com/a/60561480/1477667
module Bftr (Tree (..), bft, list, deftest) where
import Data.Function (fix)
import Data.Monoid (Endo (..))
-- | A binary tree.
data Tree a
= Empty
| Node a (Tree a) (Tree a)
deriving Show
-- An infinite stream. Why don't we just use a list? First, I think it's
-- easier to see what's going on this way. Second, the garbage collector
-- can clean up more garbage with this representation. The `bft` function
-- includes several lazy patterns. These cause the runtime system to
-- produce thunks to actually perform the pattern matches. Unlike a list
-- cons, a stream cons is a *record*, so the runtime system will produce
-- *selector thunks* for these pattern matches. That means that as soon
-- as one field of a particular cons is forced, all the selector thunks
-- matching on *either* field become available for GC simplification.
data SS a = a :< SS a
infixr 5 :<
bft' :: [a] -> Tree a
bft' xs = tree
where
-- `subtrees` is a stream of all proper subtrees of the result tree,
-- in breadth-first order, followed by infinitely many empty trees.
-- We form each tree in the result by combining a label from the input
-- list with consecutive subtrees.
tree :< subtrees = go xs subtrees
go :: [a] -> SS (Tree a) -> SS (Tree a)
go (a : as) ~(b1 :< ~(b2 :< bs)) = Node a b1 b2 :< go as bs
go [] _ = fix (Empty :<)
-- Ugh. The above definition, while beautiful, can leak space, thanks
-- to limitations of GHC's selector thunk mechanism. To avoid this,
-- at the cost of efficiency in some cases, we write this instead.
-- | Build a complete binary tree from a list of its breadth-first
-- traversal.
bft :: [a] -> Tree a
bft xs = tree
where
-- `subtrees` is a stream of all proper subtrees of the result tree,
-- in breadth-first order, followed by infinitely many empty trees.
-- We form each tree in the result by combining a label from the input
-- list with consecutive subtrees.
tree :< subtrees = go xs subtrees
go :: [a] -> SS (Tree a) -> SS (Tree a)
go (a : as) ys = Node a b1 b2 :< go as bs
where
{-# NOINLINE b2bs #-}
b1 :< b2bs = ys
b2 :< bs = b2bs
go [] _ = fix (Empty :<)
-- | Perform a simple laziness test; takes a non-negative integer.
-- This should not throw an exception.
deftest :: Int -> Int
deftest n = sum . take n . list . bft $ [1..n] ++ undefined
-- | Convert a tree to a list in breadth-first order (useful for testing).
--
-- @list . bft = id@
--
-- When @t@ is a complete tree,
--
-- @bft . list $ t = t@
list :: Tree a -> [a]
-- there's probably a better way to do this.
list t = appEndo (foldMap id (levels t)) []
levels :: Tree a -> [Endo [a]]
levels Empty = []
levels (Node a l r) = Endo (a:) : combine (levels l) (levels r)
where
combine [] ys = ys
combine (x : xs) ys = (x <> hd) : combine xs tl
where
(hd, tl) = case ys of
[] -> (mempty, [])
y:ys' -> (y, ys')
@WillNess
Copy link

WillNess commented Mar 14, 2020

no, I replied to the comment directly above mine. I tried to summarize the workings of your go function there.

is "rotate the queue" common terminology for the switching it describes? for my mind, rotate (x:xs) == xs++[x].

also, I'd name b's argument bw, for uniformity and clarity. I can already see it's [Tree a], so calling it ts doesn't add any new clues; but seeing that it's the same bw as in f clarifies f's definition for me. Or, we could get rid of b altogether with

breadthFirst :: forall a. [Tree a] -> [a]
breadthFirst [] = []
breadthFirst ts = foldr f (breadthFirst . reverse) ts []
  where
    f (Node x l r) fw bw = x : fw (r : l : bw)
    f Empty        fw bw =     fw bw

my bflist can be made to skip over the Empty's ahead of time, at the cost of putting more clauses into its definition:

bflist2 :: Tree a -> [a]
bflist2 Empty =  []
bflist2 t     =  map (\ ~(Node x _ _) -> x) q
     where
     q  =  t : go 1 q
     go 0  _                 =  []
     go i (Node _ Empty Empty : ns)  =          go (i-1) ns
     go i (Node _ l     Empty : ns)  =  l     : go (i)   ns
     go i (Node _ Empty r     : ns)  =      r : go (i)   ns
     go i (Node _ l     r     : ns)  =  l : r : go (i+1) ns

Might perform better than the shorter version.

@oisdk
Copy link

oisdk commented Mar 14, 2020

I haven't followed the whole conversation here, but for breadth-first traversals I think the optimal function is the following:

levels :: Tree a -> [[a]]
levels t = takeWhile (not . null) (f t (repeat []))
  where
    f Empty                qs  = qs
    f (Node x ls rs) ~(q : qs) = (x:q) : f ls (f rs qs)

It would probably benefit from some optimisation, like using streams instead of lists:

data Stream a = a :< Stream a

toListWhile p (x :< xs)
  | p x       = x : toListWhile p xs
  | otherwise = []

levels :: Tree a -> [[a]]
levels t = toListWhile (not . null) (f t (fix ([] :<)))
  where
    f Empty                 qs  = qs
    f (Node x ls rs) ~(q :< qs) = (x:q) :< f ls (f rs qs)

It also looks the most like the "inverse" of the unfold.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment