Skip to content

Instantly share code, notes, and snippets.

@Metaxal
Created October 16, 2022 17:16
Show Gist options
  • Save Metaxal/97a9f582b0c84c8aa1c4aebeb77dc88e to your computer and use it in GitHub Desktop.
Save Metaxal/97a9f582b0c84c8aa1c4aebeb77dc88e to your computer and use it in GitHub Desktop.
Comparison between hash-ref on hasheq (for fixnums) and a custom binary search
#lang racket
(require racket/fixnum
math/base)
;;; Conclusion: hasheq-ref is (slightly) faster than a custom binary search
;;; (It's actually probably implemented as a binary search, with some more optimization,
;;; as the bin search is within a factor 2)
(#%declare #:unsafe)
(define (make-random-fxvector N)
(apply fxvector
(sort (build-list N (λ _ (random-natural (fxquotient (most-positive-fixnum) 2))))
fx<)))
;; Warning: The addition could overflow!
(define (fxvector<-find vec x)
(let ([x (fx+ x)] [end (fxvector-length vec)])
(let loop ([start 0] [idx (fxrshift end 1)] [end end])
(cond [(fx= start end)
#f]
[else
(define y (fxvector-ref vec idx))
(cond [(fx< x y)
(define new-idx (fxrshift (fx+ start idx) 1))
(loop (fx+ start) (fx+ new-idx) (fx+ idx))]
[(fx> x y)
(define new-idx (fxrshift (fx+ idx 1 end) 1))
(loop (fx+ idx 1) (fx+ new-idx) (fx+ end))]
[else (fx+ idx)])]))))
(module+ test
(require rackunit)
(define vec (make-random-fxvector 10))
(for ([i (in-range (fxvector-length vec))])
(define x (fxvector-ref vec i))
(check = i (fxvector<-find vec x)))
;; Look for numbers that aren't there
(for ([x (in-fxvector vec)] [y (in-fxvector vec 1)])
(unless (< y (+ x 2))
(check-false (fxvector<-find vec (fxquotient (+ x y) 2))))))
(module+ main
(define N 10000000 #;10000000)
(define h (make-hasheq))
(displayln "build random vec")
(define vec (time (make-random-fxvector N)))
(displayln "vec -> hash")
(time
(for ([x (in-fxvector vec)] [i (in-naturals)])
(hash-set! h x i)))
(displayln "build random queries")
(define N-queries 1000000)
(for ([exists? '(#t #f)])
(if exists?
(displayln "Random existing queries")
(displayln "Random queries"))
(define queries
(time
(if exists?
;; existing queries
(for/fxvector #:length N-queries ([i (in-range N-queries)])
(fxvector-ref vec (random N)))
;; likely absent queries
(for/fxvector #:length N-queries ([i (in-range N-queries)])
(random-natural (most-positive-fixnum))))))
(collect-garbage)
(collect-garbage)
(displayln "vec")
(time
(for ([x (in-fxvector queries)])
(fxvector<-find vec x)))
(collect-garbage)
(displayln "hash")
(time
(for ([x (in-fxvector queries)])
(hash-ref h x #f)))))
@Metaxal
Copy link
Author

Metaxal commented Oct 16, 2022

Results:

$ racket hasheq-ref-vs-bin-search.rkt
build random vec
cpu time: 12984 real time: 12997 gc time: 1247
vec -> hash
cpu time: 10444 real time: 10454 gc time: 7225
build random queries
Random existing queries
cpu time: 204 real time: 204 gc time: 0
vec
cpu time: 436 real time: 436 gc time: 0
hash
cpu time: 187 real time: 188 gc time: 0
Random queries
cpu time: 906 real time: 907 gc time: 8
vec
cpu time: 242 real time: 242 gc time: 0
hash
cpu time: 167 real time: 167 gc time: 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment