@nichm
nichm / retrieval-for-memory-systems-public.md
Created April 14, 2026 18:28
Retrieval for Memory Systems: BM25 vs Vector Search — When Each Fails

A practical guide for anyone building search over personal knowledge bases, agent memory, or private document corpora. It focuses on the vocabulary gap problem, the case for hybrid retrieval, and a decision framework for choosing the right architecture. The numbers come from production benchmarks across 68 engines on 2,000+ queries over real personal knowledge bases, not synthetic data and not Wikipedia.


The Short Version (Before Any Numbers)

If you've built or used a search-based memory system and wondered why it keeps returning wrong or empty results, the answer is almost always one of two things: