Skip to content

Instantly share code, notes, and snippets.

@Deleplace
Created September 8, 2022 12:51
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save Deleplace/0544f0681912348666ee48e2d55bc58d to your computer and use it in GitHub Desktop.
Save Deleplace/0544f0681912348666ee48e2d55bc58d to your computer and use it in GitHub Desktop.
Benchmark heavy use of slices.Insert vs. heavy use of a dynamic-array-like insert func
package insert
import (
"fmt"
"testing"
"golang.org/x/exp/slices"
)
func BenchmarkGrow(b *testing.B) {
for _, M := range []int{
10,
1_000,
100_000,
} {
{
name := fmt.Sprintf("insert_%d", M)
b.Run(name, func(b *testing.B) {
for i := 0; i < b.N; i++ {
var a []int
for j := 0; j < M; j++ {
a = slices.Insert(a, 0, 0)
}
Sink = a
}
})
}
{
name := fmt.Sprintf("insertDynamic_%d", M)
b.Run(name, func(b *testing.B) {
for i := 0; i < b.N; i++ {
var a []int
for j := 0; j < M; j++ {
a = insertDynamic(a, 0, 0)
}
Sink = a
}
})
}
}
}
// insertDynamic is like slices.Insert, except it allocates much less
// frequently, by allocating 25% extra memory whenever it needs to grow s.
func insertDynamic[S ~[]E, E any](s S, i int, v ...E) S {
tot := len(s) + len(v)
if tot <= cap(s) {
s2 := s[:tot]
copy(s2[i+len(v):], s[i:])
copy(s2[i:], v)
return s2
}
return append(v, s...)
}
// Prevent optimizing things away
var Sink []int
@Deleplace
Copy link
Author

Results on my workstation:

% go test -bench=.
goos: darwin
goarch: amd64
pkg: insert
cpu: VirtualApple @ 2.50GHz
BenchmarkGrow/insert_10-10         	 3999915	       294.9 ns/op
BenchmarkGrow/insertDynamic_10-10  	 4214118	       283.6 ns/op
BenchmarkGrow/insert_1000-10       	    1603	    757460 ns/op
BenchmarkGrow/insertDynamic_1000-10         	    5380	    215887 ns/op
BenchmarkGrow/insert_100000-10              	       1	4795886125 ns/op
BenchmarkGrow/insertDynamic_100000-10       	       1	1830154000 ns/op

Both funcs are O(M^2), when inserting M elements one by one in a loop. However, insertDynamic tends to be 3x as fast because it allocates less often, and memory allocation tends to dominate the cost of slices.Insert.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment