Skip to content

Instantly share code, notes, and snippets.

@vasalf
Last active April 8, 2019 17:30
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save vasalf/c246171d8ebd784487624853566b8df2 to your computer and use it in GitHub Desktop.
Save vasalf/c246171d8ebd784487624853566b8df2 to your computer and use it in GitHub Desktop.
Google Summer of Code Haskell Proposal. Note: there are three compilable Haskell files: one for each goal.
module AcyclicGraph where
import qualified Algebra.Graph
import Control.Applicative
import Control.Monad
import Data.List (find, union)
import Data.Maybe (fromJust)
{- AcyclicGraph definition
~~~~~~~~~~~~~~~~~~~~~~~~~~
Please note that this code is written in order to demonstrate the idea of how to represent
algebraic acyclic graphs. Although this code is compilable, it is not tested properly. It also
might be not very efficient and some of its parts might be redudant. I tried to point out such
places.
The idea behind this class is to split the graph into "layers" at the constructor level. If some
pair of vertices store equal values but belong to different levels, they are assumed as different.
The edges are forced to end at latter level then they begin, which gives a guarantee for graph
being acyclic.
-}
data AcyclicGraph a = Empty
| Vertex a
| Overlay (AcyclicGraph a) (AcyclicGraph a)
| Connect [a] (AcyclicGraph a)
| Shift (AcyclicGraph a)
deriving (Eq, Show)
{-
The "Empty" constructor constructs an empty graph.
The "Vertex" constructor constructs a graph of one layer with one vertex at this layer.
The "Overlay" constructor constructs a graph with union of vertices of both graphs at each level
and union of edges from both graph between levels.
The "Connect" constructor differs from its analogue from alga. It creates a graph with given list
of vertices at the first level, all other vertices took from given graph with layers shifted by
one, all edges from the given graph and all edges between the first layer and every vertex of every
other layer.
The "Shift" constructor creates a graph with an empty first layer and all other vertices from given
graph with layers shifted by 1.
For example, an acyclic graph on 4 vertices with edges "1->2", "1->3", "1->4", "2->4" and "3->4"
might be represented as
Connect [1] $ Connect [2, 3] $ Vertex 4
As well as
Overlay (Shift $ Connect [2, 3] $ Vertex 4) $ Overlay (Connect [1] $ Overlay (Vertex 2) (Vertex 3)) $ Connect [1] $ Shift $ Vertex 4
Note that "Shift" and "Vertex" constructors are actually redudant:
Shift = Connect []
Vertex x = Connect [x] Empty
I added the "Shift" constructor in order to (possibly) replace it with "Shift Int" constructor.
The "Vertex" constructor is added for better usability.
In order to demonstrate the abilities of such definition, I drafted the instances for the same
classes as they are defined for Algebra.Graph.Graph.
We can declare the "foldg" function in the same fashion as it is in Algebra.Graph:
-}
foldg :: b -> (a -> b) -> (b -> b -> b) -> ([a] -> b -> b) -> (b -> b) -> AcyclicGraph a -> b
foldg e v o c s = go
where
go Empty = e
go (Vertex x) = v x
go (Overlay x y) = o (go x) (go y)
go (Connect x y) = c x $ go y
go (Shift x) = s $ go x
{-
This function is useful when transforming graphs, we will use it later.
An important property of acyclic graphs is that we can easily do any dinamic programming on them.
We demonstrate such ability with writing another analogue of fold which returns value of dynamic
programming for each vertex. Note that this function could be implemented in more effective way if
we were using maps instead of lists, but it seems enough for demonstration.
-}
vertexList :: Eq a => AcyclicGraph a -> [(a, Int)]
vertexList = foldg [] v union c s
where
-- v :: a -> [(a, Int)]
v x = [(x, 0)]
-- c :: [a] -> [(a, Int)] -> [(a, Int)]
c x y = (concat $ map v x) ++ (s y)
-- s :: [(a, Int)] -> [(a, Int)]
s = map (\(x, y) -> (x, y + 1))
foldDp :: Eq a => (a -> b) -> (b -> b -> b) -> (b -> b -> b) -> AcyclicGraph a -> [((a, Int), b)]
foldDp init update updateEdge g
| null (vertexList g) = []
| otherwise = let resTail = foldDp init update updateEdge (cropFirst g)
in (buildFirst g resTail) ++ (shiftLevel resTail)
where
-- buildFirst :: AcyclicGraph a -> [((a, Int), b)] -> [((a, Int), b]
buildFirst Empty _ = []
buildFirst (Vertex v) _ = [((v, 0), init v)]
buildFirst (Overlay x y) t = unite (buildFirst x t) (buildFirst y t)
buildFirst (Connect a x) t = connect init a x t
buildFirst (Shift g) _ = []
-- findValue :: [((a, Int), b)] -> (a, Int) -> ((a, Int), b)
findValue lst x = fromJust $ find (\y -> fst y == x) lst
-- updateValue :: [((a, Int), b)] -> (b -> b -> b) -> ((a, Int), b) -> [((a, Int), b)]
updateValue u [] v = [v]
updateValue u (((x1,y1),z1):vs) ((x2,y2),z2)
| (x1, y1) == (x2, y2) = ((x1, y1), u z1 z2):vs
| otherwise = ((x1, y1), z1):(updateValue u vs ((x2, y2), z2))
-- unite :: [((a, Int), b)] -> [((a, Int), b)] -> [((a, Int), b)]
unite ls [] = ls
unite ls (a:as) = unite (updateValue update ls a) as
-- initVals :: (a -> b) -> [a] -> [((a, Int), b)]
initVals init = map (\x -> ((x, 0), init x))
-- connect :: b -> [a] -> AcyclicGraph a -> [((a, Int), b)] -> [((a, Int), b)]
connect init as g t = foldr (combine g) (initVals init as) as
where
-- combine :: AcyclicGraph a -> a -> [((a, Int), b)] -> [((a, Int), b)]
combine gr v cur = foldr (combineInner v) cur (map (findValue t) $ vertexList gr)
-- combineInner :: a -> ((a, Int), b) -> [((a, Int), b)] -> [((a, Int), b)]
combineInner v w cur = updateValue updateEdge cur ((v, 0), (snd w))
-- cropFirst :: AcyclicGraph a -> AcyclicGraph a
cropFirst Empty = Empty
cropFirst (Vertex v) = Empty
cropFirst (Overlay x y) = Overlay (cropFirst x) $ cropFirst y
cropFirst (Connect x y) = y
cropFirst (Shift x) = x
-- shiftLevel :: [(a, Int), b] -> [(a, Int), b]
shiftLevel = map shiftOnce
where
shiftOnce ((x, y), z) = ((x, y + 1), z)
maxDepth :: Eq a => AcyclicGraph a -> Int
maxDepth g = foldr max 0 $ map snd $ foldDp (const 0) max (flip $ max . (+1)) g
{-
The "maxDepth" function demonstates abilities of "foldDp" functions. First three arguments of
"findDp" are:
init :: a -> b -- Function that initializes values of Dynamic Programming.
update :: b -> b -> b -- Function that returns value of Dynamic Programming given the values in
-- two overlaying graphs.
updateEdge :: b -> b -> b -- Function that returns value of Dynamic Programming updated by edge
-- from vertex with value equal to the first argument to vertex with
-- value equal to last argument.
In "maxDepth" we do a classic Dynamic Programming algorithm on acycling graph: we find length of
the longest path in the graph.
As you can see, we crop the first layer inside this function, then we recursively find the Dynamic
Programming values in the next layers and then we build values for the first layer.
This implementation actually seems to work in N^2 time, but it can be improved with using faster
data structures for storing the dynamic programming values instead of lists.
We will use the next few functions in instances of classes below. This might not be the most
efficient way to organize the graph: the "Shift" constructor might be replaced with "Shift Int"
taking number of empty levels to be inserted. Still, this way seems a bit more readable.
-}
shiftN :: Int -> AcyclicGraph a -> AcyclicGraph a
shiftN 0 g = g
shiftN n g = Shift $ shiftN (n - 1) g
fromShiftN :: Int -> AcyclicGraph a -> AcyclicGraph a
fromShiftN 0 g = g
fromShiftN n (Shift g) = fromShiftN (n - 1) g
layerMaxDepth :: AcyclicGraph a -> Int
layerMaxDepth = foldg 0 (const 1) max (flip $ const . (+1)) (+1)
{-
Currently we cannot make up the Num instance for our graph because we don't have multiplication yet.
Here we create its analogue which shifts the layers in the second argument by number of layers in
the first and connects each vertex of the first graph to each vertex of the second. Note that this
operation is associative but not commutative.
-}
connect :: AcyclicGraph a -> AcyclicGraph a -> AcyclicGraph a
connect g h = go (layerMaxDepth g) g h
where
go :: Int -> AcyclicGraph a -> AcyclicGraph a -> AcyclicGraph a
go s Empty h = Empty
go s (Vertex v) h = Connect [v] $ shiftN s h
go s (Overlay x y) h = Overlay (go s x h) (go s y h)
go s (Connect x y) h = Overlay (Connect x (shiftN s h)) $ go (s - 1) y h
go s (Shift x) h = Shift $ go (s - 1) x h
{-
Now we can make a Num instance for our type.
-}
instance (Eq a, Num a) => Num (AcyclicGraph a)
where
fromInteger = Vertex . fromInteger
(+) = Overlay
(*) = connect
signum = const Empty
abs = id
negate = id
{-
Functor instance does not differ much from one for Algebra.Graph.
-}
instance Functor AcyclicGraph where
fmap f = foldg Empty (Vertex . f) Overlay (\x y -> Connect (f <$> x) y) Shift
{-
Speaking of Applicative instance, we are supposed to follow the Applicative functor laws. A natural
way to do the sequential application is to create a copy of first argument graph instead of each
vertex of the second argument and apply each function in it to the values in the second graph. As I
understand, this is the way it is done in alga.
-}
instance Applicative AcyclicGraph where
pure = Vertex
Empty <*> g = Empty
(Vertex f) <*> g = f <$> g
(Overlay a b) <*> g = Overlay (a <*> g) (b <*> g)
-- Not a very efficient one, but this must work. We create a new graph for each vertex on the
-- list and then overlay the resulting graphs.
(Connect x b) <*> g = foldr Overlay Empty (makeGraph <$> x)
where
makeGraph f = connect (f <$> g) (b <*> g)
(Shift f) <*> g = Shift $ f <*> g
{-
Monad instance is a tricky one. The natural way to define ">>=" is to replace each vertex of the
first argument with graph returned by the second. The problem is that the graphs returned for
different vertices of the same level might have different depths. Then, different vertices of
different graphs might be assumed as equal if they have same value, which is not naturally
supposed.
The solution is to shift all levels in the graph by maximum depth of returned graph of the previous
level, and then do the similar thing.
We didn't face this problem with "<*>" because the graphs there were guaranteed to have the same
maximum depth.
-}
instance Monad AcyclicGraph where
return = pure
g >>= f = helper (layerShift g $ maxLayerDepths f g) f
where
-- helper :: AcyclicGraph a -> (a -> AcyclicGraph b) -> AcyclicGraph b
helper Empty f = Empty
helper (Overlay a b) f = Overlay (helper a f) (helper b f)
helper (Connect x b) f = foldr Overlay Empty (makeGraph <$> x)
where
makeGraph a = let g = f a
in connect g $ fromShiftN (layerMaxDepth g) $ helper b f
helper (Shift g) f = Shift $ helper g f
maxLayerDepths :: (a -> AcyclicGraph b) -> AcyclicGraph a -> [Int]
maxLayerDepths f = foldg [] v o c s
where
-- v :: a -> [Int]
v = pure . layerMaxDepth . f
-- This one can be actually replaced with sequential application for ZipLists.
-- o :: [Int] -> [Int] -> [Int]
o (x:xs) [] = x:(o xs [])
o [] (y:ys) = y:(o [] ys)
o (x:xs) (y:ys) = (max x y):(o xs ys)
-- c :: [a] -> [Int] -> [Int]
c as x = (foldr max 0 (layerMaxDepth . f <$> as)):x
-- s :: [Int] -> [Int]
s ds = 0:ds
layerShift :: AcyclicGraph a -> [Int] -> AcyclicGraph a
layerShift Empty xs = Empty
layerShift (Vertex v) xs = Vertex v
layerShift (Overlay x y) xs = Overlay (layerShift x xs) $ layerShift y xs
layerShift (Connect t g) (x:xs) = Connect t $ shiftN x $ layerShift g xs
layerShift (Shift g) (x:xs) = shiftN x $ layerShift g xs
{-
Nothing special for Alternate and MonadPlus, just the same as for Algebra.Graph
-}
instance Alternative AcyclicGraph where
empty = Empty
(<|>) = Overlay
instance MonadPlus AcyclicGraph where
mzero = empty
mplus = (<|>)
module AdvancedGraphAlgorithms where
import Data.List
import qualified Data.Map.Strict as MS
{- (optional) Advanced Graph Algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Network Flows
~~~~~~~~~~~~~
I shall start with the idea from the ideas least: network maximum flows. The reason I choose it for
the proposal is that my definition for AcyclicGraph data type above particullary fits into the idea
of Dinic's algorithm, which is one of the fastest algorithm for maximum flow that is applicable on
practice. Indeed, in Dinic's algorithm we split the given graph into levels by the shortest
distance and ignore any edges that doesn't go from some level to the next, and then we try to push
some flow through those edges.
Unfortunatelly, as the depth-first search in Dinic's algorithm implementation is of very special
kind, we can't just push the maximum flow by building the graph and calling foldDp function from it.
But anyway we can do it: the correct approach to find many paths for pushing the flow would be to
build an adjacency map and do a depth-first search on it, either finding the way to push the flow
through or finding that there is no more way. In first case we need to delete the edge from
temporary adjacency map.
The resulting algorithm works in O(pVE log V) with O(log v) overhead for using adjacency map, where
p is number of phases which is proved to be not greater then V and has many proved bounds for
special graphs, for example, the ones proved by Karzanov.
Minimum-Cost Flows
~~~~~~~~~~~~~~~~~~
Well, if we find the shortest paths and the maximum flows, why don't we find minimum-cost flows?
The classical approach starts with getting rid of negative cycles. It is the most time-consumable
part of the algorithm, and if there are none, it is a great boost for the algorithm. To deal with
negative cycles, we find them with Bellman-Ford algorithm, then we push the maximum flow through
the cycle with one depth-first search.
Then we are guaranteed not to have negative cycles, so we find the minimum cost flow. To do so, we
launch the Johnson algorithm to find the shortest path and then we push maximum flow through this
path.
From now it seems that the minumim-cost flow algorithms is a simple combination of algorithms that
are already implemented to that point with two small depth-first searches to push the flow.
Algorithms for bipartite graphs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Yes, there are plenty, and someone **might** find them useful.
We start with alga-style definition for bipartite graph.
-}
data BipartiteGraph a b = Empty
| LVertex a
| RVertex b
| Overlay (BipartiteGraph a b) (BipartiteGraph a b)
| Connect (BipartiteGraph a b) (BipartiteGraph a b)
deriving Show
{-
Here, vertices of left and right parts might have different types. The "Overlay" constructor does
the very same thing as in alga, and the "Connect" constructor additionaly creates an edge from
every left vertex of the first argument to every right vertex of the second argument and vice-versa.
I could try to make any instances for basic data clases (which seems rather interesting: for
example, `fmap` would be applied only to the right part), but my purpose is to show how do I see
the basic bipartite graph algorithms for this data class.
Maximum bipartite matching
~~~~~~~~~~~~~~~~~~~~~~~~~~
There are two approaches: Kuhn's algorithm in O(|M|E) and Hopcroft-Karp's algorithm in O(Esqrt(V)).
Not only Kuhn's algorithm might be more efficient on special graphs, but also some parts of it are
needed in another algorithms.
Kuhn simply tries to find an augmenting path with depth-first search. If it doesn't manage to, it
is known that current matching is of maximum size.
And Hopcroft-Karp is just an application of Dinic's algorithm to the maximum bipartite matching
problem. It has some optimized implementations, but basically, it is. Actually, Kuhn's algorithm is
only application of Ford-Fulkerson algorithm.
So, if we have network flows, we can easily find maximum biparite matching. I see the function
definition as below:
-}
maximumMatching :: (Ord a, Ord b) => BipartiteGraph a b -> MS.Map a b
maximumMatching = undefined
-- And if we want to know neighbours of _each_ vertex
maximumMatchingLR :: (Ord a, Ord b) => BipartiteGraph a b -> (MS.Map a b, MS.Map b a)
maximumMatchingLR = undefined
{-
Bipartite Vertex Cover and Independent Set
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
And when we have the maximum mathching algorithms, we can use Kőnig-Egerváry's theorem to find the
minimum Vertex Cover and the maximum Independent Set (which is simply complement to the Vertex
Cover). To find the minimum Vertex Cover, we launch the depth-first search from Kuhn's algorithm
and take only unvisited vertices from the left part and visited vertices from the right. As we need
to keep track of visited vertices in Kuhn's algorithm, this seems to be just usage of Kuhn's
algorithm for finding augmenting paths.
I see the function definitions as below:
-}
leftVertexList :: (Ord a, Ord b) => BipartiteGraph a b -> [a]
leftVertexList = undefined
rightVertexList :: (Ord a, Ord b) => BipartiteGraph a b -> [b]
rightVertexList = undefined
minimumVertexCover :: (Ord a, Ord b) => BipartiteGraph a b -> [Either a b]
minimumVertexCover = undefined
maximumIndependentSet :: (Ord a, Ord b) => BipartiteGraph a b -> [Either a b]
maximumIndependentSet g = ((Left <$> leftVertexList g) ++ (Right <$> rightVertexList g)) \\ minimumVertexCover g
{-
The point of this document was to show that there are some algorithms that can be easily coded in
case all of the idea description is coded (including maximum flow algorithms). With only few
additional depth-first searches we can implement many new algorithms. I didn't draft them as all
non-optional algorithms but I suppose them to be coded easily.
By the way, at least I would be interested in alga library if it had Hopcroft-Karp algorithm
implementation. A short time ago I was searching for some library implementation to benchmark my
own code with it and the only one I found was written in Python and it was rather slow because of
that. I'd love to use a library with many graph algorithms implemented.
-}
module WeightAlgorithms where
import Algebra.Graph.Labelled
import Data.Maybe
import Data.List
import qualified Data.Map.Strict
import qualified Data.Set
import qualified Algebra.Graph.Labelled.AdjacencyMap as AM
{- Shortest paths algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I'll start with Bellman-Ford algorithm for it is quite simple to implement it in functional style.
As long as this is only a draft version, this code is still not properly tested and might be not
efficient enough.
-}
-- Next two subroutines are useful for the demos to store/change the current return value in lists.
findValue :: Eq a => a -> [(a, Maybe e)] -> Maybe e
findValue x l = snd $ fromJust $ find (\v -> fst v == x) l
update :: (Eq a, Ord e) => a -> e -> [(a, Maybe e)] -> [(a, Maybe e)]
update x d l = map upd l
where
upd (v, dpVal)
| v /= x = (v, dpVal)
| Nothing <- dpVal = (v, Just d)
| Just o <- dpVal = (v, Just $ min o d)
shortestPathFordBellman :: (Monoid e, Ord e, Ord a) => a -> Graph e a -> [(a, Maybe e)]
shortestPathFordBellman v g = helper v (edgeList g) ((,) <$> vertexList g <*> pure Nothing)
where
-- helper :: a -> [(e, a, a)] -> [(a, Maybe e)] -> [(a, Maybe e)]
helper v es dp = go ((length dp) - 1) v es dp
where
-- go :: Int -> a -> [(e, a, a)] -> [(a, Maybe e)] -> [(a, Maybe e)]
go 0 v _ l = update v mempty l
go n v es l = let prevLayer = go (n - 1) v es l
in foldr upd prevLayer es
where
-- upd :: (e, a, a) -> [(a, Maybe e)] -> [(a, Maybe e)]
upd (e, a, b) dp = case findValue a dp of
Nothing -> dp
Just d -> update b (d `mappend` e) dp
{-
This one seems to work on small examples, which seems enough for the demonstration. The function
takes a vertex and a graph and returns vector of shortest paths from the vertex to all other
vertices (or Nothing if there's no path). This doesn't work in O(VE) as it is supposed to because
we use lists to store the dynamic programming values and update them in linear time, so it is
actually O(V^2E). This might be fixed by using some built-in maps.
Another common algorithm for finding shortest paths in graphs is Floyd-Warshall's algorithm, which
finds the shortest path from each vertex to each other. It can be coded in the same fashion, with
the main difference that "go" function will be iterating through vertices and doing two nested
folds of the array of vertices inside. It seems that "naive" implementation of this algorithm,
which stores the answer in a 2d-array and updates the current value in linear time would work in
O(V^4), which doesn't seem very useful anyway. Thus, I skip this algorithm in this demo.
The next common algorithm for finding shortest paths in graphs is Dijkstra's algorithm. Here I
draft a V^2-time version:
-}
toAdjacencyMap :: (Eq e, Monoid e, Ord a) => Graph e a -> AM.AdjacencyMap e a
toAdjacencyMap = foldg AM.empty AM.vertex AM.connect
shortestPathDijkstra :: (Monoid e, Ord e, Ord a) => a -> Graph e a -> [(a, Maybe e)]
shortestPathDijkstra v g = helper v (toAdjacencyMap g) Data.Set.empty $ initial v mempty g
where
-- helper :: a -> AM.AdjacencyMap e a -> Data.Set.Set a -> [(a, Maybe e)] -> [(a, Maybe e)]
helper v mp s l
| Just (u, _) <- selectNext s l = helper v mp (Data.Set.insert u s) $ updateOnce (AM.adjacencyMap mp) u l
| otherwise = l
initial :: (Monoid e, Ord e, Ord a) => a -> e -> Graph e a -> [(a, Maybe e)]
initial v e g = update v e $ (,) <$> vertexList g <*> pure Nothing
-- selectNext :: Data.Set.Set a -> [(a, Maybe e)] -> Maybe (a, e)
selectNext s = foldr (combine s) Nothing
where
-- combine :: Data.Set.Set a -> (a, Maybe e) -> Maybe a -> Maybe (a, e)
combine s (vertex, value) cur
| vertex `Data.Set.member` s = cur
| Nothing <- value = cur
| Just v <- value, Nothing <- cur = Just (vertex, v)
| Just v <- value, v < (snd $ fromJust cur) = Just (vertex, v)
| otherwise = cur
-- updateOnce :: Data.Map.Strict.Map a (Data.Map.Strict.Map a e) -> a -> [(a, Maybe e)] -> [(a, Maybe e)]
updateOnce mp v l = Data.Map.Strict.foldrWithKey (upd (fromJust $ findValue v l)) l (fromJust $ Data.Map.Strict.lookup v mp)
where
-- upd :: e -> a -> e -> [(a, Maybe e)] -> [(a, Maybe e)]
upd d u w l = update u (d `mappend` w) l
{-
Dijkstra's algorithm works only with non-negative edges, but it seems that there's no way to force
non-negativity check on the type level.
This implementation actually works in O(VE) time because I'm using arrays instead of more efficient
data structures. This can be improved, as everywhere else. After that, this impelementation will
work efficiently on dense graphs. To make it more efficient on sparse graph, there might be another
implementation provided: it would use Data.Heap for selecting minimum in "selectNext" function
instead of searching through list.
Another algorithm for finding shortest path from each vertex to each is Johnson's algorithm, which
is actually Dijkstra's algorithm with some special precalculation. I don't provide it here in the
demo because there's a little additional code needed: the precalculation is a call to Ford-Bellman
algorithm with fancy arguments. It works with all graphs which do not contain negative cycles.
Unfortunatelly, it seems that there's no way to force it on the type level as we need at least to
call some shortest path algorithm to check it.
-}
{- Minimum spanning tree algorithms
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are three efficient approaches to that problem: Prim, Kruskal and Boruvka's.
Speaking of Prim's algorithm, its code won't differ much from one for Dijkstra's algorithm: the
main difference is the way we update the array of distances (we actually update it with "w", not
"d `mappend` w". Also, we might want to construct the tree, so we might need to maintain current
result as another argument to "helper" function.
Kruskal's algorithm is based on "Disjoint Set Union" data structure. One can either implement it
himself or use an implementation from "disjoint-set" package, which is a bit weird as it claims to
use persistent maps for data structure with amortised complexity (actually, in this case I don't
believe that it works in O(log N) time, it must be O(log^2 N)). Anyway, I couldn't manage to use
this package as the code is not compilable, so I was forced to write it myself with strict maps,
which obviously does have only O(log N) overhead.
-}
{-
IMPORTANT NOTE: Alga does not seem to have any special data types for bidirectional graphs, I
suppose they are considered as a special case of directed graphs. Spanning trees are only defined
for bidirectional graphs. I could either check the graph for being bidirectional in runtime or
consider all edges in the graph being bidirectional. This implementation would work in both cases
except for it doesn't check edges on having another copy. I suppose this is enough for the demo.
-}
data DSEntry a = DSEntry { parent :: a, height :: Int }
data DisjointSetUnion a = DSUnion { dsuMap :: Data.Map.Strict.Map a (DSEntry a) }
-- Constructs a union of sets each containing exactly one element.
distinctSets :: Ord a => [a] -> DisjointSetUnion a
distinctSets = DSUnion . Data.Map.Strict.fromList . map (\x -> (x, DSEntry x 0))
-- Yes, this operation does modify the structure.
rootOfSet :: Ord a => a -> DisjointSetUnion a -> (DisjointSetUnion a, a)
rootOfSet v dsu@(DSUnion mp)
| parent (fromJust (Data.Map.Strict.lookup v mp)) == v = (dsu, v)
| Just (DSEntry u _) <- Data.Map.Strict.lookup v mp = let (newDsu, w) = rootOfSet u dsu
in (changeRoot v w newDsu, w)
where
-- changeRoot :: a -> a -> DisjointSetUnion a -> DisjointSetUnion a
changeRoot x y = DSUnion . Data.Map.Strict.insertWith insertHelper x (DSEntry y 0) . dsuMap
-- insertHelper :: DSEntry a -> DSEntry a -> DSEntry a
insertHelper e1 e2 = DSEntry (parent e1) (height e2)
inSameSets :: Ord a => a -> a -> DisjointSetUnion a -> (DisjointSetUnion a, Bool)
inSameSets x y dsu = let (dsuX, rootX) = rootOfSet x dsu;
(dsuY, rootY) = rootOfSet y dsuX
in (dsuY, rootX == rootY)
-- Unites distinct sets without checking whether they are distinct
uniteSetsUnsafe :: Ord a => a -> a -> DisjointSetUnion a -> DisjointSetUnion a
uniteSetsUnsafe x y dsu = let (dsu1, r1) = rootOfSet x dsu;
(dsu2, r2) = rootOfSet y dsu1;
h1 = height $ fromJust $ Data.Map.Strict.lookup r1 $ dsuMap dsu2;
h2 = height $ fromJust $ Data.Map.Strict.lookup r2 $ dsuMap dsu2
in helper r1 h1 r2 h2 dsu2
where
-- helper :: a -> Int -> a -> Int -> DisjointSetUnion a -> DisjointSetUnion a
helper r1 h1 r2 h2 dsu
| h1 < h2 = DSUnion $ Data.Map.Strict.insert r1 (DSEntry r2 $ max (h1 + 1) h2) $ dsuMap dsu
| otherwise = DSUnion $ Data.Map.Strict.insert r2 (DSEntry r1 $ max h1 $ h2 + 1) $ dsuMap dsu
-- Unites sets or does nothing if they are already united
uniteSets :: Ord a => a -> a -> DisjointSetUnion a -> DisjointSetUnion a
uniteSets x y dsu = let (dsu1, r1) = rootOfSet x dsu;
(dsu2, r2) = rootOfSet y dsu1
in helper r1 r2 dsu2
where
-- helper :: a -> a -> DisjointSetUnion a -> DisjointSetUnion a
helper x y
| x == y = id
| otherwise = uniteSetsUnsafe x y
minimumSpanningTreeKruskal :: (Monoid e, Ord e, Ord a) => Graph e a -> e
minimumSpanningTreeKruskal g = fst $ go (distinctSets $ vertexList g) $ sort $ edgeList g
where
-- go :: DisjointSetUnion a -> [(e, a, a)] -> (e, DisjointSetUnion a)
go d = foldl combine (mempty, d)
-- combine :: (e, DisjointSetUnion a) -> (e, a, a) -> (e, DisjointSetUnion a)
combine (c, dsu) (d, u, v) = let (dsu1, b) = inSameSets u v dsu
in if b
then (c, dsu1)
else (c `mappend` d, uniteSetsUnsafe u v dsu1)
{-
As you see, Kruskal's algorithm is pretty simple in Haskell in case there is a working DSU. This
implementation works in O(E log(V) alpha(V)) time, which is quite efficient.
I don't draft Boruvka's algorithm where as it seems there is no use in it. It is quite tricky to
implement because of some problems with uniqueness of edge weights, for example. And its efficiency
is no better then Prim's algorithm efficiency. The practical value of Boruvka's algorithm is that
it can be easily evaluated in multiple threads. As long as we are not supposed to do threading in
this draft, there's no use in trying to implement it.
-}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment