Skip to content

Instantly share code, notes, and snippets.

@karlseguin
Created January 16, 2015 12:37
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save karlseguin/a6b441269d82690381e5 to your computer and use it in GitHub Desktop.
Save karlseguin/a6b441269d82690381e5 to your computer and use it in GitHub Desktop.
specialized int set
/*
Playing with building a specialized set for integers. Specifically:
- Integers are natually randomly distributed (auto-incrementing ids for millions of records would do)
- The number of items is known upfront (can be approximate)
- The set sizes changes infrequently (we're mostly interested in membership tests)
Accomodating constraints, to be sure, but if we can meet the first two requirements,
we'll end up with a little less than 1/2 the memory of a map[int]struct{}, and membership
tests taking about ~1/2 the time.
Deletion isn't supported, but would be trivial to implement (it would use the same index function
used by exists and insert).
As the fill rate drops, performance stays good, but memory consumption becomes relatively worse (vs the built-in map)).
Insert allocates a lot of bytes in exchange for having a small final size. Nevertheless, inserting performs about
the same as inserting into a pre-allocated map, and much better than a dynamically allocated map.
*/
type Stupid struct {
mask int
buckets [][]int
}
func New(size int) *Stupid {
count := upTwo(size / 32);
s := &Stupid{
mask: count-1,
buckets: make([][]int, count),
}
return s
}
func (s *Stupid) Set(value int) {
index := value & s.mask
bucket := s.buckets[index]
l := len(bucket)
if l == 0 {
s.buckets[index] = []int{value}
return
}
position, exists := s.index(value, bucket)
if exists {
return
}
arr := make([]int, l+1)
copy(arr, bucket[:position])
arr[position] = value
copy(arr[position+1:], bucket[position:])
s.buckets[index] = arr
}
func (s *Stupid) Exists(value int) bool {
bucket := s.buckets[value & s.mask]
_, exists := s.index(value, bucket)
return exists
}
func (s *Stupid) index(value int, bucket []int) (int, bool) {
l := len(bucket)
for i := 0; i < l; i++ {
v := bucket[i]
if v == value {
return i, true
}
if v > value {
return i, false
}
}
return l, false
}
// http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2
func upTwo(v int) int {
v--;
v |= v >> 1
v |= v >> 2
v |= v >> 4
v |= v >> 8
v |= v >> 16
v++
return v
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment