Skip to content

Instantly share code, notes, and snippets.

@ranma42
Created November 26, 2013 18:10
Show Gist options
  • Save ranma42/7663095 to your computer and use it in GitHub Desktop.
Save ranma42/7663095 to your computer and use it in GitHub Desktop.
namespace InlineTest
module MyArray =
let inline fold<'T,'State> (f : 'State -> 'T -> 'State) (acc: 'State) (array:'T[]) =
let mutable state = acc
let len = array.Length
for i = 0 to len - 1 do
state <- f state array.[i]
state
let inline sumBy (f: ^U -> _) array =
let acc = LanguagePrimitives.GenericZero< ^U >
let s x y = f y |> Checked.(+) x
fold s acc array
let inline sum array = sumBy id array
module MySeq =
let foldAdapt<'T,'State> (f : 'State -> 'T -> 'State) (acc: 'State) (source:seq<'T>) =
use e = source.GetEnumerator()
let f = OptimizedClosures.FSharpFunc<_,_,_>.Adapt(f)
let mutable state = acc
while e.MoveNext() do
state <- f.Invoke(state, e.Current)
state
let inline foldInline<'T,'State> (f : 'State -> 'T -> 'State) (acc: 'State) (source:seq<'T>) =
use e = source.GetEnumerator()
let mutable state = acc
while e.MoveNext() do
state <- f state e.Current
state
module Test =
type MyType() =
static member Naive(x:_[]) =
let mutable r = LanguagePrimitives.GenericZero<_>
for v in x do
r <- Checked.(+) r v
r
static member SeqFold(x:_[]) =
Seq.fold Checked.(+) LanguagePrimitives.GenericZero<_> x
static member SeqSum(x:_[]) =
Seq.sum x
static member ArrayFold(x:_[]) =
Array.fold Checked.(+) LanguagePrimitives.GenericZero<_> x
static member ArraySum(x:_[]) =
Array.sum x
static member MySeqFoldAdapt(x:_[]) =
MySeq.foldAdapt Checked.(+) LanguagePrimitives.GenericZero<_> x
static member MySeqFoldInline(x:_[]) =
MySeq.foldInline Checked.(+) LanguagePrimitives.GenericZero<_> x
static member MyArrayFold(x:_[]) =
MyArray.fold Checked.(+) LanguagePrimitives.GenericZero<_> x
static member MyArraySum(x:_[]) =
MyArray.sum x
let inline bench f arg =
let t = System.Diagnostics.Stopwatch.StartNew()
f arg |> ignore
t.Elapsed.TotalMilliseconds
[<EntryPoint>]
let main args =
let x = Array.create 100000000 1
let elapsed = Array.create 9 0.0
for i = 1 to 10 do
elapsed.[0] <- bench MyType.Naive x
elapsed.[1] <- bench MyType.SeqFold x
elapsed.[2] <- bench MyType.SeqSum x
elapsed.[3] <- bench MyType.ArrayFold x
elapsed.[4] <- bench MyType.ArraySum x
elapsed.[5] <- bench MyType.MySeqFoldAdapt x
elapsed.[6] <- bench MyType.MySeqFoldInline x
elapsed.[7] <- bench MyType.MyArrayFold x
elapsed.[8] <- bench MyType.MyArraySum x
for t in elapsed do
printf "%7.2f " t
printf "-> "
let c = 1.0 / elapsed.[0]
for t in Array.map ((*) c) elapsed do
printf "%7.2f " t
printfn ""
0
@ranma42
Copy link
Author

ranma42 commented Nov 26, 2013

This micro-benchmark compares the time needed to compute the sum of an array in several ways.
On MacOSX 10.9 running it in Mono version 3.2.5 ((detached/25a3e0e Tue Nov 26 14:29:09 CET 2013) it results in about this performance:

  • Naive 1.0 (reference)
  • SeqFold 14.0
  • SeqSum 9.5
  • ArrayFold 2.0
  • ArraySum 1.0
  • MySeqFoldAdapt 10.0
  • MySeqFoldInline 9.5
  • MyArrayFold 1.0
  • MyArraySum 1.0

or:

  • 1.0 Naive, ArraySum, MyArrayFold, MyArraySum
  • 2.0 ArrayFold
  • 9.5 SeqSum, MySeqFoldInline
  • 10.0 MySeqFoldAdapt
  • 14.0 SeqFold

Some observations:

  • the speed of ArrayFold can be easily brought to the optimal value simply by making it inline (which also allows to rewrite sumBy and sum as folds with no loss of performance)
  • Seq.fold can be improved by adapting its first argument (which improves the performance without causing the function to be inlined)
  • Seq.fold can be further improved by inlining it; as expected this results in getting the same performance as Seq.sum, but since this function has a much more complex body than Array.fold, it might be desirable to avoid inlining it in order to keep assemblies smaller
  • iterating over a sequence has a very high cost (at least on the current Mono runtime)
  • the ratios are pretty high because a simple (checked) sum was performed; more complex operations would hide some of the overhead, but simpler operations (example: unchecked sum) would make it even bigger

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment