Skip to content

Instantly share code, notes, and snippets.

@aspiwack
Last active October 21, 2020 20:11
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save aspiwack/090ff68e0eebd599de8277659d893d86 to your computer and use it in GitHub Desktop.
Save aspiwack/090ff68e0eebd599de8277659d893d86 to your computer and use it in GitHub Desktop.
Heap: benchmarking pairing heaps vs braun trees
PKG core_bench
PKG qcheck
B _build

Benchmarking Heaps

I found out recently that Coq had a heap implementation based on Braun trees, contributed by Jean-Christophe Filliâtre. It hadn’t heard of Braun trees, so it got me curious. I found a delightful article by Chris Okasaki (which also cites plenty of previous work which I haven’t taken time to read yet).

My main interrogation was how it compared with pairing heaps an implementation of heaps (which apparently predates the Braun-tree based one by 10 years) which is famous for being both very simple and very efficient. So I got around to making measurements (and learnt along the way that doing meaningful measurements is not an obvious thing to do).

Results

It’s worth pointing out that, if the Braun-tree based implementation is rather simple, pairing heaps are incredibly easy to code (in fact, it may have taken me less time to write the pairing heap implementation from scratch, than to find the Coq implementation again on my hard drive, copy it in and adapt it for the sake of uniformity).

Also, besides the efficiency considerations, pairing heap support the merge operation which the Braun-tree based implementation does not.

I benchmarked random sequences of operations using the Core_bench micro-benchmarking tool. I ran two types of sequences of operations:

  • Sequences of (random) add followed by the same number of remove_min (ordered in the result table, followed by the number of add).
  • Random sequences of (random) add and remove_min where the proportion of remove_min is a parameter (they are indicated by the proportion of remove_min, followed by the number of operations).

In random sequences, whatever the proportion of remove_min, pairing heap are consistently faster: a significant 25% faster. The proportion of remove_min-s does not affect average performance. In the case where add-s are done first and are followed by remove_min-s, the difference is smaller but pairing heaps still come out ahead.

Memory consumption is higher for the Braun-tree implementation in both cases (over twice higher in the case of the random sequences of operation).

Results grouped by test kind (sorted by time/run below):

NameTime/RunmWd/RunmjWd/RunProm/RunPercentage
Pairing heap (ordered):503.15us1.96kw1.49w1.49w4.45%
Pairing heap (ordered):1007.24us4.79kw7.26w7.26w10.24%
Pairing heap (ordered):25021.83us14.78kw54.84w54.84w30.86%
Pairing heap (ordered):50050.83us33.67kw245.65w245.65w71.85%
Pairing heap (1/2):1004.79us1.46kw0.71w0.71w6.78%
Pairing heap (1/2):2009.81us3.16kw2.82w2.82w13.87%
Pairing heap (1/2):50025.29us8.54kw17.79w17.79w35.75%
Pairing heap (1/2):100052.41us17.84kw72.64w72.64w74.09%
Pairing heap (1/3):1004.80us1.46kw0.71w0.71w6.78%
Pairing heap (1/3):20010.00us3.16kw2.84w2.84w14.13%
Pairing heap (1/3):50025.81us8.54kw17.77w17.77w36.48%
Pairing heap (1/3):100052.74us17.84kw72.55w72.55w74.56%
Pairing heap (1/4):1004.87us1.63kw1.10w1.10w6.89%
Pairing heap (1/4):2009.92us3.51kw4.57w4.57w14.02%
Pairing heap (1/4):50025.63us9.49kw29.45w29.45w36.23%
Pairing heap (1/4):100052.37us19.79kw119.55w119.55w74.02%
Pairing heap (1/7):1004.65us1.77kw1.65w1.65w6.57%
Pairing heap (1/7):2009.85us3.82kw6.92w6.92w13.92%
Pairing heap (1/7):50024.98us10.35kw45.57w45.57w35.31%
Pairing heap (1/7):100053.33us21.65kw185.02w185.02w75.39%
Braun heap (ordered):503.42us2.66kw1.21w1.21w4.83%
Braun heap (ordered):1008.39us6.69kw5.92w5.92w11.86%
Braun heap (ordered):25025.35us21.24kw45.32w45.32w35.83%
Braun heap (ordered):50060.70us49.81kw209.43w209.43w85.80%
Braun heap (1/2):1005.81us2.28kw0.80w0.80w8.21%
Braun heap (1/2):20012.58us5.58kw3.54w3.54w17.78%
Braun heap (1/2):50033.25us17.57kw26.10w26.10w47.01%
Braun heap (1/2):100068.71us40.76kw116.67w116.67w97.13%
Braun heap (1/3):1005.63us2.28kw0.80w0.80w7.96%
Braun heap (1/3):20012.14us5.58kw3.51w3.51w17.16%
Braun heap (1/3):50032.30us17.58kw25.91w25.91w45.65%
Braun heap (1/3):100070.02us40.77kw116.73w116.73w98.97%
Braun heap (1/4):1005.76us2.57kw1.22w1.22w8.14%
Braun heap (1/4):20012.26us6.16kw5.48w5.48w17.34%
Braun heap (1/4):50033.05us18.77kw39.71w39.71w46.72%
Braun heap (1/4):100070.13us42.87kw179.13w179.13w99.14%
Braun heap (1/7):1005.70us2.77kw1.78w1.78w8.06%
Braun heap (1/7):20012.03us6.47kw7.94w7.94w17.01%
Braun heap (1/7):50032.32us19.40kw57.75w57.75w45.69%
Braun heap (1/7):100070.74us43.66kw263.82w263.82w100.00%

Results sorted by time/run:

NameTime/RunmWd/RunmjWd/RunProm/RunPercentage
Pairing heap (ordered):503.15us1.96kw1.49w1.49w4.45%
Braun heap (ordered):503.42us2.66kw1.21w1.21w4.83%
Pairing heap (1/7):1004.65us1.77kw1.65w1.65w6.57%
Pairing heap (1/2):1004.79us1.46kw0.71w0.71w6.78%
Pairing heap (1/3):1004.80us1.46kw0.71w0.71w6.78%
Pairing heap (1/4):1004.87us1.63kw1.10w1.10w6.89%
Braun heap (1/3):1005.63us2.28kw0.80w0.80w7.96%
Braun heap (1/7):1005.70us2.77kw1.78w1.78w8.06%
Braun heap (1/4):1005.76us2.57kw1.22w1.22w8.14%
Braun heap (1/2):1005.81us2.28kw0.80w0.80w8.21%
Pairing heap (ordered):1007.24us4.79kw7.26w7.26w10.24%
Braun heap (ordered):1008.39us6.69kw5.92w5.92w11.86%
Pairing heap (1/2):2009.81us3.16kw2.82w2.82w13.87%
Pairing heap (1/7):2009.85us3.82kw6.92w6.92w13.92%
Pairing heap (1/4):2009.92us3.51kw4.57w4.57w14.02%
Pairing heap (1/3):20010.00us3.16kw2.84w2.84w14.13%
Braun heap (1/7):20012.03us6.47kw7.94w7.94w17.01%
Braun heap (1/3):20012.14us5.58kw3.51w3.51w17.16%
Braun heap (1/4):20012.26us6.16kw5.48w5.48w17.34%
Braun heap (1/2):20012.58us5.58kw3.54w3.54w17.78%
Pairing heap (ordered):25021.83us14.78kw54.84w54.84w30.86%
Pairing heap (1/7):50024.98us10.35kw45.57w45.57w35.31%
Pairing heap (1/2):50025.29us8.54kw17.79w17.79w35.75%
Braun heap (ordered):25025.35us21.24kw45.32w45.32w35.83%
Pairing heap (1/4):50025.63us9.49kw29.45w29.45w36.23%
Pairing heap (1/3):50025.81us8.54kw17.77w17.77w36.48%
Braun heap (1/3):50032.30us17.58kw25.91w25.91w45.65%
Braun heap (1/7):50032.32us19.40kw57.75w57.75w45.69%
Braun heap (1/4):50033.05us18.77kw39.71w39.71w46.72%
Braun heap (1/2):50033.25us17.57kw26.10w26.10w47.01%
Pairing heap (ordered):50050.83us33.67kw245.65w245.65w71.85%
Pairing heap (1/4):100052.37us19.79kw119.55w119.55w74.02%
Pairing heap (1/2):100052.41us17.84kw72.64w72.64w74.09%
Pairing heap (1/3):100052.74us17.84kw72.55w72.55w74.56%
Pairing heap (1/7):100053.33us21.65kw185.02w185.02w75.39%
Braun heap (ordered):50060.70us49.81kw209.43w209.43w85.80%
Braun heap (1/2):100068.71us40.76kw116.67w116.67w97.13%
Braun heap (1/3):100070.02us40.77kw116.73w116.73w98.97%
Braun heap (1/4):100070.13us42.87kw179.13w179.13w99.14%
Braun heap (1/7):100070.74us43.66kw263.82w263.82w100.00%

Complexity considerations

The Braun-tree based implementation has good worst-case bounds: add and remove_min are both O(log(n)) worst-case. Pairing heaps, on the other hands, have O(1) add but O(n) remove_min in the worst-case.

In the table, we can observe that the time of a sequence of operations is roughly linear in its length, indicating that both implementations have O(1) amortised time for both add and remove_min. However, this is under the assumption that the heaps are used in a single-threaded manner. Since both implementation are purely functional, it is possible to make pairing heap very inefficient by repeating many times a slow remove_min. Braun heaps are much more resistant to this kind of attacks.

Compiling & running the benchmarks

In an environment with ocamlbuild, ocamlfind and core_bench available, run the benchmarks with:

ocamlbuild -use-ocamlfind bench.native --

In an environment with ocamlbuild, ocamlfind and qtest available, run the tests with:

ocamlbuild -use-ocamlfind test.native --

Notes about using the results in Org mode

To build insert the table into Org mode I run the tests with

ocamlbuild -use-ocamlfind heap.native -- -ascii

Then I copy it in an Org mode buffer, remove the first column because it isn’t parsed well by Org mode. Select the table, C-c |, paste the first column back, insert | in front of each line with string-insert-rectangle, make sure the horizontal-rule line starts with |- (no space), normalise the table (C-c C-c).

true: package(core_bench,oUnit,qcheck), thread
open Heap
(** [heap_ops (n,d) len] generates a random sequence of Heap
operations of size [len], [n/d] of them, on average, being removes. *)
let heap_ops (module H : Heap) (n,d) len =
let int () = Random.int (10*len) in
let op () =
if Random.int d < n then H.remove_min
else H.add (int ())
in
let rec loop h = function
| 0 -> h
| n -> loop (op () h) (n-1)
in
H.min (loop H.empty len)
(** [heap_adds len] generates a random sequence of [add] of size [len], then [len] [remove_min]. *)
let heap_adds (module H : Heap) len =
let int () = Random.int (20*len) in
let rec loop_add h = function
| 0 -> h
| n -> loop_add (H.add (int()) h) (n-1)
in
let rec loop_remove h = function
| 0 -> h
| n -> loop_remove (H.remove_min h) (n-1)
in
H.min (loop_remove (loop_add H.empty len) len)
let heap_ops_bench name rat h =
Core_bench.Std.Bench.Test.create_indexed
~name
~args:[100;200;500;1000]
(fun len -> Core.Std.Staged.stage (fun () -> heap_ops h rat len))
let heap_adds_bench name h =
Core_bench.Std.Bench.Test.create_indexed
~name
~args:[50;100;250;500]
(fun len -> Core.Std.Staged.stage (fun () -> heap_adds h len))
let () =
Core.Std.Command.run @@ Core_bench.Std.Bench.make_command [
heap_adds_bench "Pairing heap (ordered)" (module Pairing);
heap_ops_bench "Pairing heap (1/2)" (1,3) (module Pairing);
heap_ops_bench "Pairing heap (1/3)" (1,3) (module Pairing);
heap_ops_bench "Pairing heap (1/4)" (1,4) (module Pairing);
heap_ops_bench "Pairing heap (1/7)" (1,7) (module Pairing);
heap_adds_bench "Braun heap (ordered)" (module Braun);
heap_ops_bench "Braun heap (1/2)" (1,3) (module Braun);
heap_ops_bench "Braun heap (1/3)" (1,3) (module Braun);
heap_ops_bench "Braun heap (1/4)" (1,4) (module Braun);
heap_ops_bench "Braun heap (1/7)" (1,7) (module Braun);
]
module type Heap = sig
type t
val empty : t
val add : int -> t -> t
val min : t -> int option
val remove_min : t -> t
end
module Pairing : Heap = struct
type t =
| Empty
| Node of int * t list
let empty = Empty
let merge h1 h2 =
match h1,h2 with
| Empty,h | h,Empty -> h
| Node(m1,c1) , Node(m2,c2) ->
if m1 <= m2 then
Node(m1,h2::c1)
else
Node (m2,h1::c2)
let add a h =
merge (Node(a,[])) h
let min = function
| Empty -> None
| Node(a,_) -> Some a
let rec merge_pairs = function
| [] -> Empty
| [h] -> h
| h1::h2::rest -> merge (merge h1 h2) (merge_pairs rest)
let remove_min = function
| Empty -> Empty
| Node(a,c) -> merge_pairs c
end
module Braun = struct
type t =
| Empty
| Node of t * int * t
let empty = Empty
let rec add a = function
| Empty -> Node(Empty,a,Empty)
| Node(l,b,r) ->
if a < b then
Node(add b r,a,l)
else
Node(add a r,b,l)
let min = function
| Empty -> None
| Node(_,a,_) -> Some a
let is_below a = function
| Empty -> true
| Node (_, b, _) -> a <= b
let rec replace_min a = function
| Node (l, _, r) when is_below a l && is_below a r ->
Node (l, a, r)
| Node ((Node (_, la, _) as l), _, r) when is_below la r ->
(* la <= a, ra necessarily *)
Node (replace_min a l, la, r)
| Node (l, _, (Node (_, ra, _) as r)) ->
(* ra <= a, la necessarily *)
Node (l, ra, replace_min a r)
| Empty | Node (Empty, _, _) | Node (_, _, Empty) ->
assert false
let rec extract = function
| Empty ->
assert false
| Node (Empty, b, r) ->
(* assert (r = Empty); *)
b, Empty
| Node (l, b, r) ->
let a, l = extract l in
a, Node (r, b, l)
(* merges two Braun trees [l] and [r],
with the assumption that [size r <= size l <= size r + 1] *)
let rec merge l r =
match l,r with
| l , Empty -> l
| Empty , r -> assert false (* hypothesis *)
| Node (ll, la, lr), Node (_, ra, _) ->
if la <= ra then
Node(r , la , merge ll lr)
else
let (a',l') = extract l in
Node (replace_min a' r , ra , l')
let remove_min = function
| Empty -> Empty
| Node(l,a,r) -> merge l r
end
open Heap
type op =
| Add of int
| Remove
let op_pp = function
| Add x -> Printf.sprintf "add(%d)" x
| Remove -> Printf.sprintf "remove"
let gen_ops n =
let open QCheck.Gen in
let gen_op =
frequency
[ 2, return (fun x -> Add x) <*> small_int
; 1, return Remove
]
in
list_size (0--n) gen_op
let arb_ops n : op list QCheck.arbitrary =
let shrink_op o =
let open QCheck.Iter in
match o with
| Add x ->
return (fun x -> Add x) <*> QCheck.Shrink.int x
| Remove -> empty
in
let shrink =
QCheck.Shrink.list ~shrink:shrink_op in
let print = QCheck.Print.list op_pp in
QCheck.make ~shrink ~print (gen_ops n)
let heap_ops (module H : Heap) ops =
let rec heap_ops h = function
| [] -> h
| Add x :: r -> heap_ops (H.add x h) r
| Remove :: r -> heap_ops (H.remove_min h) r
in
H.min (heap_ops H.empty ops)
module TBraun = struct
open Braun
let rec size = function
| Empty -> 0
| Node (l,_,r) -> 1+size l+size r
let size_invariant = function
| Empty -> true
| Node (l,_,r) ->
let sl = size l and sr = size r in
sl = sr || sl = sr+1
let rec heap_invariant = function
| Empty -> true
| Node (l,x,r) -> is_below x l && is_below x r && heap_invariant l && heap_invariant r
(* Duplicate because of the way first-class module interact with typechecking *)
let heap_ops ops test =
let rec heap_ops h = function
| [] -> h
| Add x :: r -> heap_ops (add x h) r
| Remove :: r -> heap_ops (remove_min h) r
in
test (heap_ops empty ops)
end
let () =
QCheck_runner.run_tests_main QCheck.Test.[
make ~name:"Pairing & Braun agree" (arb_ops 300) (fun ops -> heap_ops (module Pairing) ops = heap_ops (module Braun) ops);
make ~name:"Braun heaps are Braun trees" (arb_ops 300) (fun ops -> TBraun.(heap_ops ops size_invariant));
make ~name:"Braun heaps are min heaps" (arb_ops 300) (fun ops -> TBraun.(heap_ops ops size_invariant));
]
@backtracking
Copy link

Thanks @aspiwack for this quite interesting benchmark!

For the record, this is because of Okasaki's paper that I opted for the Braun tree based version. At that time, I knew already several other ways of implementing priority queues (e.g. binomial heaps, leftist heap, skew heaps, and pairing heaps, at least). But Braun trees are my favorite.

Note: The truth is that, most of the time, you don't need priority queues to be applicative. If so, a binary heap stored inside a (resizable) array is to be preferred, as is much more efficient. It is a pity that the literature is full of cute applicative heap data structures, but that we don't need them in practice. But may be that's the case for this specific application in Coq...

@aspiwack
Copy link
Author

I had forgotten I'd even done this benchmark 🙂 . Glad you liked it (even though I'm now even more suspicious of benchmark results as I was then).

My favourite heap implementation is still pairing heap, by the way, just because I can remember it.

I imagine a binary heap stored inside an array would be absurdly fast because of the good cache behaviour. But even so, people seem to reach for (mutable) linked structures such as Fibonacci heap more often than not. I don't know enough to venture a guess as to why it is so, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment