Skip to content

Instantly share code, notes, and snippets.

@xeoncross
Created June 22, 2022 02:13
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save xeoncross/2119cc70d272aea0ee0a83e830ea51a6 to your computer and use it in GitHub Desktop.
Save xeoncross/2119cc70d272aea0ee0a83e830ea51a6 to your computer and use it in GitHub Desktop.
Encoding a sorted sequence of integers using delta encoding can save 40%-60% on storage costs: https://go.dev/play/p/KdLslro7n1n
package iiexperiments
import (
"fmt"
"testing"
)
// How much space will using a delta encoding save?
// If we assume we have 1 byte (0-255) to tell us how many bytes the delta is contained in, we can then only store the diff between the last and next int values. This difference is the "delta".
// Additional space could be saved by having deltas span byte boundaries such as only using 3 bits (000-111) to represent up to 8 bytes (int64). This means you could use 3 bits + 5 bits left over to get started storing the delta bits.
func TestDeltaEncodingSavings(t *testing.T) {
results := []int{}
// max := 1<<63 - 1 // 9223372036854775807
max := 1<<32 - 1
// Run multiple simulations to test this
for _, value := range []int{5, 50, 500, 5000, 500000, 5000000} {
r1 := 0
r2 := 0
var last int // int64
for i := 1; i < max; i++ {
// Regular encoding
r1 += 8
// Delta Encoding
r2 += requiredBits(i-last) + 1 // (1 byte length prefix)
last = i
// similate random docID's (but sorted)
// i += rand.Intn(100000 / (j + 1)) // slowly growing skip size
i += value
// i += i / 8
}
// Update results
results = append(results, r1)
results = append(results, r2)
}
// Report results
for i := 0; i < len(results); i += 2 {
fmt.Printf("RAW: %10d\n", results[i])
fmt.Printf("ENC: %10d (%.0f%% savings)\n\n", results[i+1], 100-float32(results[i+1])/float32(results[i])*100)
}
}
func requiredBits(n int) int {
if n > (1<<32 - 1) {
return 8 // uint64
}
if n > (1<<16 - 1) {
return 4 // uint32
}
if n > (1<<8 - 1) {
return 2 // uint16
}
return 1 // uint8
}