This is a Rust library that implements the Hierarchical Navigable Small World (HNSW) algorithm for approximate nearest neighbor search. HNSW is an efficient algorithm for finding the nearest neighbors of high-dimensional vectors in a large dataset.
To use the HNSW library in your Rust project, add the following to your Cargo.toml
file:
[dependencies]
hnsw = "0.1.0"
Then, in your Rust code, you can import and use the library as follows:
use hnsw::{HnswIndex, VectorItem, EuclideanDistance};
fn main() {
// Create a new HNSW index
let hnsw = HnswIndex::new(10000, 1.0 / 3.0, 16, Box::new(EuclideanDistance));
// Add items to the index
for i in 0..1000 {
let vector: Vec<f64> = (0..10).map(|_| rand::random::<f64>() * 10.0).collect();
let item = VectorItem { id: i, vector };
hnsw.add(item).unwrap();
}
// Perform a search query
let query_vec = vec![1.0, 2.0];
let query = VectorItem { id: 9999, vector: query_vec };
match hnsw.search(&query, 5) {
Ok(results) => {
println!("Search results for query {:?}:", query.vector);
for result in results {
println!("Found item with id: {}", result.id);
}
},
Err(e) => println!("Search error: {}", e),
}
}
The HnswIndex
struct represents an HNSW index. It provides methods for adding items to the index and performing nearest neighbor searches.
new(max_elements: usize, level_lambda: f64, max_level: usize, distance_calculator: Box<dyn DistanceCalculator>) -> Self
Creates a new HnswIndex
instance.
max_elements
: The maximum number of elements the index can hold.level_lambda
: The level generation probability.max_level
: The maximum number of levels in the index.distance_calculator
: A boxed trait object implementing theDistanceCalculator
trait for calculating distances between vectors.
Adds a new item to the index.
item
: TheVectorItem
to be added.
Returns Ok(())
if the item is successfully added, or an error message if the operation fails.
Performs a nearest neighbor search for the given query vector.
query
: The queryVectorItem
.k
: The number of nearest neighbors to return.
Returns a Result
containing a vector of the top k
nearest neighbors, or an error message if the search fails.
The VectorItem
struct represents an item in the index.
id
: The unique identifier of the item.vector
: The feature vector of the item.
The DistanceCalculator
trait defines the interface for calculating distances between vectors.
Calculates the distance between two VectorItem
s.
item1
: The firstVectorItem
.item2
: The secondVectorItem
.
Returns the calculated distance as a f64
value.
The main.rs
file provides an example of how to use the HNSW library. It demonstrates creating an index, adding items to it, and performing nearest neighbor searches.
To run the example, execute the following command in the project directory:
cargo run
This will build and run the example code, displaying the search results for the test queries.
I created
/tmp/hnsw.txt
like this:I wrote the first version of that code with Claude 3 Haiku:
llm -m claude-3-haiku 'write Bash code that loops through every *.rs file in the current directory and for each one outputs its name, then the content of the file, then ----'