rklaehn/wikipedia.md

## wikipedia.md

      
            
          
## Scenario


- Moderate-size dataset (300 GB)
- Too large to be stored entirely on end-user hardware
- The seeder is not fast enough to serve all clients
- Every user has a small part of the dataset, but none has all of it
- Users on consumer hardware want to browse with low latency

This scenario is mostly about content discovery, and it is a hard one: the Hypercore team had issues with it.
I don't think content discovery and content retrieval can be completely separated while staying efficient. Ideally, the answer to a content discovery query uses the same format as a request for content, so that a peer can turn an answer directly into a request. Hypercore does this with a compressed bitmap.
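
To make the "same format for answers and requests" idea concrete, here is a minimal sketch. It assumes the dataset is split into numbered chunks and uses sorted `(start, length)` runs as a stand-in for a compressed bitmap. The names `to_runs` and `intersect` are hypothetical, and this is not Hypercore's actual encoding or wire format.

```python
from typing import Iterable, List, Tuple

Run = Tuple[int, int]  # (start, length) of consecutive chunk indices


def to_runs(chunks: Iterable[int]) -> List[Run]:
    """Compress a set of chunk indices into sorted (start, length) runs."""
    runs: List[Run] = []
    for i in sorted(set(chunks)):
        if runs and runs[-1][0] + runs[-1][1] == i:
            # Chunk i directly follows the last run, so extend that run.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            runs.append((i, 1))
    return runs


def intersect(a: List[Run], b: List[Run]) -> List[Run]:
    """Intersect two run lists without expanding them to single chunks."""
    out: List[Run] = []
    i = j = 0
    while i < len(a) and j < len(b):
        a_start, a_end = a[i][0], a[i][0] + a[i][1]
        b_start, b_end = b[j][0], b[j][0] + b[j][1]
        lo, hi = max(a_start, b_start), min(a_end, b_end)
        if lo < hi:
            out.append((lo, hi - lo))
        # Advance whichever run ends first.
        if a_end <= b_end:
            i += 1
        else:
            j += 1
    return out


# Discovery answer: the chunks a peer says it holds (assumed example data).
peer_has = to_runs([0, 1, 2, 3, 10, 11, 12, 40])
# Chunks the local user still needs for the page being browsed.
wanted = to_runs(range(2, 12))
# The request sent to that peer has the same shape as the peer's answer.
request = intersect(peer_has, wanted)
```

Because the answer and the request share one representation, building a request is just an intersection of two run lists, and the peer can serve it without translating between formats.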