This file contains the benchmark results for different iteration models proposed for inclusion in the OCaml's standard library.
- Machine: Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz (x86_64)
- Compiler: 4.03.0 (-O3 -unbox-closures -unbox-closures-factor 20)
name | description | simplified type |
---|---|---|
gen |
imperative-style generator | unit -> 'a option |
seq |
push-based iterator | ('a -> unit) -> unit |
ulist |
thunk-protected linked list | unit -> [Nil|Cons of 'a * 'a t] |
uncons |
pure iterators with explicit state | 's * ('s -> ('a * 's) option) |
The following chart shows the normalized execution rate of each iterator compared to other implementations. Higher values are better. Tests were performed with different input sizes: 100, 1K, 10K, 100K and 1M.
(Source: https://docs.google.com/spreadsheets/d/1QFaG...)
- 100:
Throughputs for "gen", "ulist", "uncons", "sequence" each running 2 times for at least 3 CPU seconds:
gen: 3.12 WALL ( 3.12 usr + 0.00 sys = 3.12 CPU) @ 77415.60/s (n=241227)
3.11 WALL ( 3.11 usr + 0.00 sys = 3.11 CPU) @ 77614.86/s (n=241227)
ulist: 3.14 WALL ( 3.14 usr + 0.00 sys = 3.14 CPU) @ 47090.24/s (n=147675)
3.17 WALL ( 3.17 usr + 0.00 sys = 3.17 CPU) @ 46614.58/s (n=147675)
uncons: 3.20 WALL ( 3.19 usr + 0.00 sys = 3.20 CPU) @ 55889.24/s (n=178622)
3.19 WALL ( 3.19 usr + 0.00 sys = 3.19 CPU) @ 55959.27/s (n=178622)
sequence: 3.14 WALL ( 3.14 usr + 0.00 sys = 3.14 CPU) @ 106652.55/s (n=334889)
3.16 WALL ( 3.15 usr + 0.00 sys = 3.16 CPU) @ 106111.85/s (n=334889)
Rate ulist uncons gen sequence
ulist 46852+-226/s -- -16% -40% -56%
uncons 55924+- 33/s 19% -- -28% -47%
gen 77515+- 95/s 65% 39% -- -27%
sequence 106382+-257/s 127% 90% 37% --
- 1,000:
Throughputs for "gen", "ulist", "uncons", "sequence" each running 2 times for at least 3 CPU seconds:
gen: 3.14 WALL ( 3.14 usr + 0.00 sys = 3.14 CPU) @ 7745.55/s (n=24352)
3.14 WALL ( 3.14 usr + 0.00 sys = 3.14 CPU) @ 7745.55/s (n=24352)
ulist: 3.16 WALL ( 3.16 usr + 0.00 sys = 3.16 CPU) @ 4709.49/s (n=14882)
3.17 WALL ( 3.17 usr + 0.00 sys = 3.17 CPU) @ 4697.60/s (n=14882)
uncons: 3.17 WALL ( 3.17 usr + 0.00 sys = 3.17 CPU) @ 5631.46/s (n=17863)
3.17 WALL ( 3.16 usr + 0.00 sys = 3.16 CPU) @ 5645.70/s (n=17863)
sequence: 3.16 WALL ( 3.16 usr + 0.00 sys = 3.16 CPU) @ 10731.65/s (n=33912)
3.17 WALL ( 3.17 usr + 0.00 sys = 3.17 CPU) @ 10704.55/s (n=33912)
Rate ulist uncons gen sequence
ulist 4704+- 6/s -- -17% -39% -56%
uncons 5639+- 7/s 20% -- -27% -47%
gen 7746+- 0/s 65% 37% -- -28%
sequence 10718+-13/s 128% 90% 38% --
- 10,000:
Throughputs for "gen", "ulist", "uncons", "sequence" each running 2 times for at least 3 CPU seconds:
gen: 3.16 WALL ( 3.16 usr + 0.00 sys = 3.16 CPU) @ 772.78/s (n=2442)
3.15 WALL ( 3.15 usr + 0.00 sys = 3.15 CPU) @ 775.73/s (n=2442)
ulist: 3.17 WALL ( 3.16 usr + 0.00 sys = 3.16 CPU) @ 472.19/s (n=1494)
3.18 WALL ( 3.18 usr + 0.00 sys = 3.18 CPU) @ 470.40/s (n=1494)
uncons: 3.16 WALL ( 3.16 usr + 0.00 sys = 3.16 CPU) @ 563.84/s (n=1784)
3.16 WALL ( 3.16 usr + 0.00 sys = 3.16 CPU) @ 563.84/s (n=1784)
sequence: 3.19 WALL ( 3.19 usr + 0.00 sys = 3.19 CPU) @ 1063.99/s (n=3392)
3.19 WALL ( 3.19 usr + 0.00 sys = 3.19 CPU) @ 1062.66/s (n=3392)
Rate ulist uncons gen sequence
ulist 471+-1/s -- -16% -39% -56%
uncons 564+-0/s 20% -- -27% -47%
gen 774+-1/s 64% 37% -- -27%
sequence 1063+-1/s 126% 89% 37% --
- 100,000:
Throughputs for "gen", "ulist", "uncons", "sequence" each running 2 times for at least 3 CPU seconds:
gen: 3.13 WALL ( 3.13 usr + 0.00 sys = 3.13 CPU) @ 77.37/s (n=242)
3.12 WALL ( 3.12 usr + 0.00 sys = 3.12 CPU) @ 77.46/s (n=242)
ulist: 3.13 WALL ( 3.12 usr + 0.00 sys = 3.12 CPU) @ 47.06/s (n=147)
3.20 WALL ( 3.20 usr + 0.00 sys = 3.20 CPU) @ 45.99/s (n=147)
uncons: 3.18 WALL ( 3.18 usr + 0.00 sys = 3.18 CPU) @ 56.29/s (n=179)
3.17 WALL ( 3.16 usr + 0.00 sys = 3.16 CPU) @ 56.57/s (n=179)
sequence: 3.16 WALL ( 3.16 usr + 0.00 sys = 3.16 CPU) @ 98.73/s (n=312)
3.17 WALL ( 3.17 usr + 0.00 sys = 3.17 CPU) @ 98.48/s (n=312)
Rate ulist uncons gen sequence
ulist 46.5+-0.5/s -- -18% -40% -53%
uncons 56.4+-0.1/s 21% -- -27% -43%
gen 77.4+-0.0/s 66% 37% -- -21%
sequence 98.6+-0.1/s 112% 75% 27% --
- 1,000,000:
Throughputs for "gen", "ulist", "uncons", "sequence" each running 2 times for at least 3 CPU seconds:
gen: 3.08 WALL ( 3.01 usr + 0.07 sys = 3.08 CPU) @ 7.48/s (n=23)
3.09 WALL ( 3.00 usr + 0.09 sys = 3.09 CPU) @ 7.44/s (n=23)
ulist: 3.08 WALL ( 3.00 usr + 0.08 sys = 3.08 CPU) @ 4.55/s (n=14)
3.09 WALL ( 3.00 usr + 0.09 sys = 3.09 CPU) @ 4.53/s (n=14)
uncons: 3.14 WALL ( 3.02 usr + 0.12 sys = 3.14 CPU) @ 5.42/s (n=17)
3.15 WALL ( 3.04 usr + 0.12 sys = 3.15 CPU) @ 5.39/s (n=17)
sequence: 3.06 WALL ( 2.97 usr + 0.09 sys = 3.06 CPU) @ 8.84/s (n=27)
3.01 WALL ( 2.93 usr + 0.08 sys = 3.01 CPU) @ 8.96/s (n=27)
Rate ulist uncons gen sequence
ulist 4.54+-0.01/s -- -16% -39% -49%
uncons 5.41+-0.01/s 19% -- -27% -39%
gen 7.46+-0.02/s 64% 38% -- -16%
sequence 8.90+-0.06/s 96% 65% 19% --