A practical guide for anyone building search over personal knowledge bases, agent memory, or private document corpora. It focuses on the vocabulary gap problem, the case for hybrid retrieval, and a decision framework for choosing the right architecture. The numbers come from production benchmarks across 68 engines on 2,000+ queries over real personal knowledge bases, not synthetic data and not Wikipedia.
If you've built or used a search-based memory system and wondered why it keeps returning wrong or empty results, the answer is almost always one of two things: