taylorza/GO-Fillslice.md

## GO-Fillslice.md

      
    Raw
  

              GO-Fillslice.md
            
          
    Filling an array or slice with a repeated pattern

Looking for an efficient pure GO approach to copy repeating patterns into a slice, for a toy project, I ran a few tests and discovered a neat approach to significantly improve performance. For the toy project, I am using this to fill a background buffer with a specific RGB color pattern, so improving this performance significantly improved my acheivable framerate.
All the test were run with a buffer of 73437 bytes, allocated as follows
var bigSlice = make([]byte, 73437, 73437)

Fill the slice with the value 65 by looping through each element and setting the value

for j := 0; j < len(bigSlice); j++ {
    bigSlice[j] = 65
}


Name
Executions
Time/Op
Bytes/Op
Allocs/Op


Benchmark_FillsliceIndex-4
24945
45540 ns/op
0 B/op
0 allocs/op


Fill the slice with the value 66 using a range loop to index the sliace

for j := range bigSlice {
    bigSlice[j] = 66
}


Name
Executions
Time/Op
Bytes/Op
Allocs/Op


Benchmark_FillsliceRange-4
35086
34316 ns/op
0 B/op
0 allocs/op


Fill the slice using the builtin copy function to incrementally duplicate the data through the array.

// Preload the first value into the array/slice
bigSlice[0] = 67      

// Incrementally duplicate the value into the rest of the container
for j := 1; j < len(bigSlice); j *= 2 {
    copy(bigSlice[j:], bigSlice[:j])
}


Name
Executions
Time/Op
Bytes/Op
Allocs/Op


Benchmark_FillsliceCopyTrick-4
749976
1579 ns/op
0 B/op
0 allocs/op


Example of using the copy trick to fill an array/slice with a multi-element pattern

// Define the pattern
pattern := []byte{0xde, 0xad, 0xbe, 0xef}

// Copy the pattern into the start of the container
copy(bigSlice, pattern)

// Incrementally duplicate the pattern throughout the container
for j := len(pattern); j < len(bigSlice); j *= 2 {
    copy(bigSlice[j:], bigSlice[:j])
}


Name
Executions
Time/Op
Bytes/Op
Allocs/Op


Benchmark_FillslicePatternCopyTrick-4
798944
1563 ns/op
0 B/op
0 allocs/op


Summary of all the results


Name
Executions
Time/Op
Bytes/Op
Allocs/Op


Benchmark_FillsliceIndex-4
24945
45540 ns/op
0 B/op
0 allocs/op


Benchmark_FillsliceRange-4
35086
34316 ns/op
0 B/op
0 allocs/op


Benchmark_FillsliceCopyTrick-4
749976
1579 ns/op
0 B/op
0 allocs/op


Benchmark_FillslicePatternCopyTrick-4
798944
1563 ns/op
0 B/op
0 allocs/op


How does it work?

Using the builtin copy function avoids a lot of the overhead indexing and bounds checking each element of the container. Each call to copy will copy double the amount of data into the array, further amortizing the cost for each succesive copy. In addition, and I have not looked at the actual implementation, but presumably the builtin copy function will perform block copies for the large spans.
Since the copy function will will not copy beyond the capacity of the target array/slice, the code avoids doing that bounds check for the final copy and lets the copy function handle the edge case. Since we are just copying existing data, there are no over allocations etc.
Here is a crude visualization of what is happening
['A'] seed value loaded at index 0


Current Data
Action
New Data


['A']
copy to index 0 to index 1
['A', 'A']


['A', 'A']
copy the first 2 elements to index 2
['A', 'A', 'A', 'A']


['A', 'A', 'A', 'A']
copy the first 4 elements to index 4
['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']


Each iteration doubles the amount of work carried out per call to copy.
Name	Executions	Time/Op	Allocs/Op
Benchmark_FillsliceIndex-4	24945	45540 ns/op	0 allocs/op
Benchmark_FillsliceRange-4	35086	34316 ns/op	0 allocs/op
Benchmark_FillsliceCopyTrick-4	749976	1579 ns/op	0 allocs/op
Benchmark_FillslicePatternCopyTrick-4	798944	1563 ns/op	0 allocs/op
Current Data	Action	New Data
['A']	copy to index 0 to index 1	['A', 'A']
['A', 'A']	copy the first 2 elements to index 2	['A', 'A', 'A', 'A']
['A', 'A', 'A', 'A']	copy the first 4 elements to index 4	['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']