gnuvince/binsearch.org

## binsearch.org

      
    Raw
  

              binsearch.org
            
          
    Introduction

In the fourth chapter of “Programming Pearls”, Jon Bentley discusses
  program correctness and tells us that as part of some of his
  programming classes, he asks the attendees to implement the binary
  search algorithm.  Although simple in appearance (“look at the middle
  element, if it’s the target terminate, if it’s smaller look in the
  upper half, otherwise look in the lower half”), it can be a
  surprisingly tricky algorithm to implement.  He cites TAoCP Vol. 3
  wherein Don Knuth mentions that although the first paper on binary
  search was published in 1946, it wasn’t until 1962 that a bug-free
  version was presented.
In this literate org file, I’ll attempt to implement binary search in
  Rust, and hopefully, get it right.
Plan

My favorite way of implementing binary search is with recursion; the
  recursive algorithm works fine in a language with tail-call
  elimination, however, this optimization is not available in Rust, and
  thus is not how I’d implement the algorithm if I wanted to put it in
  production.  Therefore, I’ll use a loop to implement the algorithm
  instead.
Our function will have the following signature:
fn binsearch<T: PartialOrd>(target: &T, collection: &[T]) -> Option<usize>;
Our function is parametrized over all types T that are partial
  orderable (e.g., all integer types, floating number types, strings,
  chars, etc.)  We used the word “array” before to talk about the
  collection being searched, however we’ll be more general and use a
  slice in our Rust program; therefore the function works with arrays,
  but also vectors and other types of collections.
Traditionally, the return value of binary search is a signed integer;
  the slice is indexed from 0 to $N-1$, and the special return value -1
  is used to denote the absence of the target.  In Rust, slices are
  indexed with the type usize, an unsigned, machine-sized integer, so
  -1 is really 2⁶⁴-1 (on a 64-bit machine), which is a valid slice
  index.  We could return an isize value, but instead, the return type
  of our function shall be Option<usize>: the value Some(i) is
  returned if the target is found at index i and None if the target
  is absent.  This pattern is also more consistent with idiomatic Rust
  and more robust (we cannot forget to check the return value before
  attempting to use it).
Tests

Rust allows unit tests to be written quite easily by marking a
  function with the #[test] attribute.  We can then use the command
  rustc --test or cargo test to build and execute those tests.
Let us start by writing some unit tests where the target is part of
  the array.
#[test]
fn test_present() {
    // i32
    assert_eq!(Some(0), binsearch(&0_i32, &[0, 1, 2, 4, 8, 16]));
    assert_eq!(Some(1), binsearch(&1_i32, &[0, 1, 2, 4, 8, 16]));
    assert_eq!(Some(2), binsearch(&2_i32, &[0, 1, 2, 4, 8, 16]));
    assert_eq!(Some(3), binsearch(&4_i32, &[0, 1, 2, 4, 8, 16]));
    assert_eq!(Some(4), binsearch(&8_i32, &[0, 1, 2, 4, 8, 16]));
    assert_eq!(Some(5), binsearch(&16_i32, &[0, 1, 2, 4, 8, 16]));

    // char
    assert_eq!(Some(0), binsearch(&'a', &['a', 'b', 'c']));
    assert_eq!(Some(1), binsearch(&'b', &['a', 'b', 'c']));
    assert_eq!(Some(2), binsearch(&'c', &['a', 'b', 'c']));

    // vector
    assert_eq!(Some(0), binsearch(&0.0, &vec![0.0, 1.0, 2.0]));
    assert_eq!(Some(1), binsearch(&1.0, &vec![0.0, 1.0, 2.0]));
    assert_eq!(Some(2), binsearch(&2.0, &vec![0.0, 1.0, 2.0]));
}
And some cases where the target is not part of the slice.  We’ll
  include the edge case of an empty slice here.
#[test]
fn test_absent() {
    assert_eq!(None, binsearch(&0, &[]));
    assert_eq!(None, binsearch(&0, &[1]));
    assert_eq!(None, binsearch(&2, &[1]));
    assert_eq!(None, binsearch(&0, &[1, 3]));
    assert_eq!(None, binsearch(&2, &[1, 3]));
    assert_eq!(None, binsearch(&4, &[1, 3]));

    assert_eq!(None, binsearch(&3.14, &vec![]));
    assert_eq!(None, binsearch(&3.14, &vec![0.0, 1.0, 2.0]));
}
Implementation

Our implementation first needs to establish the bounds of the
  sub-slice that we will be examining.  Initially, that interval is $[0,
  N)$ (where $N$ is the number of elements in the slice).
If at any point the lower bound of the interval equals or exceeds the
  upper bound, we break and we know we haven’t found our target; the
  sub-slice delimited by those bounds is empty.
The computation of the mid-point uses a small trick to avoid an
  integer overflow.  The following equations show that this is
  equivalent to computing the mid-point naively.
If the target is found, we perform an early return, otherwise we
  update the interval’s bounds: if the target is contained in the lower
  half of the slice, we modify the bounds to be $[lo, m)$, otherwise the
  new bounds are $[m+1, hi)$.
fn binsearch<T: PartialOrd>(target: &T, collection: &[T]) -> Option<usize> {
    let mut lo: usize = 0;
    let mut hi: usize = collection.len();

    while lo < hi {
        let m: usize = (hi - lo) / 2 + lo;

        if *target == collection[m] {
            return Some(m);
        } else if *target < collection[m] {
            hi = m;
        } else {
            lo = m+1;
        }
    }
    return None;
}