jldailey/passive-models.md

## passive-models.md

      
    Raw
  

              passive-models.md
            
          
    Passive Models

This is an idea for scaling out certain data when transitioning to a highly clustered architecture.
TL;DR Don't just read, mostly subscribe.
Ideally suited for data that is read often, but written rarely; the higher the r:w ratio, the more you gain from this technique.
This tends to happen to certain data when growing up into a cluster, even if you have data that has a 2:1 ratio for a single server (a very small margin in this context, meaning it is read twice for every time it is written), when you scale it up, you often don't get a 4:2 ratio, instead you get 4:1 because the two writes end up being redundant.  That is, if you can publish knowledge of the change fast enough that other edges don't make the same change.
With many workloads, such as configuration data, you are quickly scaling at c^N:1, with real-world systems doing billions of wasted reads of data that didn't change.  Modern db systems, and even simpler caching systems like memcached, can do reads this like incredibly fast, but they still cost something, produce no value, and compete for resources with requests that really do need to read information that has changed.
So, this is an attempt to reign in this c^N:1 scaling and constrain it to N:1; one read per edge per write.
Pairing a Store with a Hub

defⁿ: Hub - any library of service that provides a publish/subscribe API.
defⁿ: Store - any lib/service that provides a CRUD API.
Clients use the Store's CRUD as any ORM would, but aggressively cache the responses in memory.  When a Client makes a change to data on the Store, they simultaneously broadcast alerts through the Hub to all other Clients.  Clients use these messages to invalidate their internal caches.  The next time that resource is requested, it's newly updated version is fetched from the Store.
Since the messages broadcast through the Hub do not cause immediate reads, this allows bursts of writes to coalesce and not cause a corresponding spike in reads, but rather the read load experienced after a change is always the same, and based on the usage pattern and how you spread traffic around your cluster.
To stick with the example of configuration data, let's suppose the usage pattern is to read the configuration on every request, and we have a cluster of web servers load balanced by a round-robin rule.  Suppose an administrative application changes and commits the configuration, it also invalidates the cached configuration on each web server through the Hub.  Each subsequent request as the round-robin proceeds around the cluster will fetch an updated configuration directly from the Store.  Load balancing rules that re-use servers, such as lowest-load, can have even higher cache efficiency.
From the perspective of the code using the Client, the writes made by others just seem to take a little bit longer to fully commit, and in exchange we never ask the database for anything until we know it has new information.
You can use any Hub, and any Store, though I have so far only built a MongoDB-based Store, and for Hubs I've used Redis, and PubNub's web-based API.  Even with PubNub's relatively long latency (about 100ms), interacting with the data feels fairly natural.  Because we are by definition dealing with data that rarely changes, the difference in user experience between a write that took 10ms and one that took 110ms, is negligible.
Further Work

The Store layer requires aggressive caching, which requires that you constrain the CRUD to things where you can hash and cache effectively.  Map/reduce is not allowed, etc., it really is best for an ORM-like scenario, where you have discrete documents, and use summary documents more than complicated queries.  Complicated queries need to pass straight through to the DB for now, so you don't gain anything (though you don't lose much either), and the user of the Client remains unaware.