Why NOT to Cache
I’ve worked on a lot of projects over the years, with many teams: frontend, backend, mobile, and so forth. One topic always comes up: caching. Developers love to talk about it, and are excited to add seemingly low-cost performance enhancers to their architecture and code base. However, as Martin Fowler and many others have pointed out, caching is hard to get right. Cache invalidation is famously one of the “two hard things” in computer science, a quip usually attributed to Phil Karlton.
The typical pattern is to take a slow request - say, an API response - and store the result in a local cache (perhaps on disk, or in Redis). The simplest implementation is a get/set lookup: when the user needs the data, check whether it’s in the cache. If it’s missing or has expired, fetch the latest value and store it in the cache.
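As a minimal sketch of that get/set pattern, assuming an in-memory dictionary in place of disk or Redis: the `TTLCache` class, its `get_or_fetch` method, and the injectable `clock` are hypothetical names used only for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for a disk or Redis cache (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # Maps key -> (expiry time, cached value).
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None:
            expires_at, value = entry
            if now < expires_at:
                return value  # hit: the slow call is skipped
        # Missing or expired: this caller pays the full cost of fetch().
        value = fetch()
        self._store[key] = (now + self.ttl_seconds, value)
        return value
```

Note that whichever request arrives first after expiry is the one that waits for `fetch()`.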
For apps, one side effect of this approach is a degraded experience for a small group of users. For example, suppose a web app needs to access a slow resource that takes 2 seconds to respond. With caching, that request time goes down to milliseconds. With a short expiration time, suppose the cached value expires after an average of 100 requests during the day. At night, when traffic is lower, it might expire after every 10 requests. So the app seems fast on average, but roughly 1 in 100 requests during the day, and 1 in 10 at night, will be painfully slow. We have effectively decided to give a random user a terrible experience, for the benefit of the rest.
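To put rough numbers on this, here is a small sketch. The 2-second miss and the 1-in-100 and 1-in-10 ratios come from the example above; the 5 ms cache-hit time and the `latency_profile` helper are assumptions made for illustration.

```python
from typing import Tuple


def latency_profile(requests_per_miss: int,
                    miss_seconds: float,
                    hit_seconds: float) -> Tuple[float, float]:
    """Return (fraction of slow requests, mean latency in seconds)."""
    miss_rate = 1 / requests_per_miss
    mean = miss_rate * miss_seconds + (1 - miss_rate) * hit_seconds
    return miss_rate, mean


# hit_seconds=0.005 is an assumed value, not a measurement.
day_miss_rate, day_mean = latency_profile(100, miss_seconds=2.0, hit_seconds=0.005)
night_miss_rate, night_mean = latency_profile(10, miss_seconds=2.0, hit_seconds=0.005)
```

Under these assumptions the mean is about 25 ms by day and about 0.2 seconds at night, which looks fine on a dashboard. The mean hides the fact that 1% of daytime requests and 10% of night-time requests still wait the full 2 seconds.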
Another common problem is invalidation. Typically, it shows up when content is updated: a title or description is changed, but the change doesn’t appear right away. The fix seems simple - when a record is updated, clear the cache for that item and refresh it. However, due to data dependencies, it’s not always clear which records are affected. For example, a title may appear in multiple places, or have invisible side effects elsewhere. As a result, the situation quickly spirals out of control, leading to increasing complexity in data design and caching. And developers get the dreaded “did you clear the cache?” question on a regular basis.
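A toy sketch of how this goes wrong, assuming two cached views that both contain the same title. The `db` and `cache` dictionaries, the key names, and the three functions are all hypothetical.

```python
# Source of truth and cache, both plain dictionaries for illustration.
db = {"article:1": {"title": "Old title"}}
cache = {}


def get_article(article_id: int) -> dict:
    key = f"article:{article_id}"
    if key not in cache:
        cache[key] = dict(db[key])
    return cache[key]


def get_homepage() -> list:
    # The homepage embeds the article title, so it depends on article:1.
    if "homepage" not in cache:
        cache["homepage"] = [db["article:1"]["title"]]
    return cache["homepage"]


def update_title(article_id: int, title: str) -> None:
    key = f"article:{article_id}"
    db[key]["title"] = title
    cache.pop(key, None)  # the "obvious" invalidation: only this record


# Warm both views, then change the title.
get_article(1)
get_homepage()
update_title(1, "New title")

fresh_article_title = get_article(1)["title"]  # re-read from db
stale_homepage = get_homepage()                # still served from cache
```

The article view picks up the new title, while the homepage keeps serving the old one until someone remembers that it also depends on the article.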
Because caching is easy to implement, developers are often quick to optimize prematurely. A cache might give confidence, make tests run faster, and look good for development progress. Once in production, things get more complicated. Caching should be an absolute last resort.
Once a cache layer has been introduced into a system, additional dependencies are also created: the network connection between service and cache, and the cache keys with their possible collisions. Synchronizing caches across multiple services or data centers is also a challenge. Caches use resources that might become exhausted. And cache layers, like applications and services, must be monitored, maintained, and cleared.
I would encourage people to look a bit further afield, and avoid standard get/set cache approaches in most cases. There are some alternatives to consider:
- Can you make the dependent service faster? Why is it slow in the first place?
- Is there another way to look up the data more quickly (search indexes, for example)?
- Can the cache be avoided by a different pattern, or by moving data ownership?
- Can the requests be done in an asynchronous way, to provide updates as they arrive?
- Is the same resource being requested too often by code? Can this be reduced?
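As one illustration of the last question, here is a sketch of code that asks for each resource once per batch instead of once per item. The `load_authors` function, the `author_id` field, and the `fetch_author` callable are hypothetical.

```python
from typing import Callable, Dict, List


def load_authors(posts: List[dict],
                 fetch_author: Callable[[int], dict]) -> Dict[int, dict]:
    """Fetch each distinct author once, however many posts share it."""
    unique_ids = {post["author_id"] for post in posts}
    return {author_id: fetch_author(author_id) for author_id in unique_ids}
```

A page listing 50 posts by 3 authors then makes 3 calls rather than 50, with nothing stored beyond the lifetime of the request and nothing to invalidate.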
If performance cannot be improved, due to limitations of services outside our control, then here are some guidelines for caching:
- Pre-populate the cache - on deploy, or using a worker.
- Build an index of content or values needed and use this, with updates triggered on any change
- Move cache infrastructure close to the service owner - the higher the cache level in the architecture, the more work to invalidate
- It’s usually better for the service to implement the cache, rather than the client
- Develop strong cache keys, based on content and time, to ensure uniqueness and simplify expirations
- Consider using the design approach of Varnish - return the cached version and check for invalidation after the request
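The last guideline is often called stale-while-revalidate. The sketch below shows the general idea only and is not Varnish's implementation: the `StaleWhileRevalidate` class and its parameters are hypothetical, and it holds a single value rather than a keyed store.

```python
import threading
import time
from typing import Any, Callable, Optional


class StaleWhileRevalidate:
    """Serve the cached value at once; refresh it in the background if stale."""

    def __init__(self, fetch: Callable[[], Any], fresh_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.fetch = fetch
        self.fresh_seconds = fresh_seconds
        self.clock = clock
        self._lock = threading.Lock()
        self._value: Any = None
        self._fetched_at: Optional[float] = None
        self._refresher: Optional[threading.Thread] = None

    def get(self) -> Any:
        with self._lock:
            if self._fetched_at is None:
                # Nothing cached yet: the very first caller has to wait.
                self._store(self.fetch())
                return self._value
            value = self._value
            stale = self.clock() - self._fetched_at >= self.fresh_seconds
            refreshing = (self._refresher is not None
                          and self._refresher.is_alive())
            if stale and not refreshing:
                # Start one background refresh; this caller does not wait.
                self._refresher = threading.Thread(target=self._refresh,
                                                   daemon=True)
                self._refresher.start()
            return value

    def _refresh(self) -> None:
        fresh = self.fetch()  # the slow call happens off the request path
        with self._lock:
            self._store(fresh)

    def _store(self, value: Any) -> None:
        self._value = value
        self._fetched_at = self.clock()
```

The trade-off is that callers may briefly see an outdated value, but after the first load no user waits on the slow resource.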
Any platform that uses a cache should be able to run completely without it. A system that depends on a warmed-up cache is a fragile and worrisome design. Developers should be able to clear entire caches without fear - knowing that the responsible services will recover, and the overall system will remain stable.
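One way to keep that property in code is to treat any cache failure as a plain miss. This is a sketch under assumptions: `read_through`, `DictCache`, and `BrokenCache` are hypothetical names, and the cache is assumed to expose simple `get` and `set` methods.

```python
from typing import Any, Callable


class DictCache:
    """Working in-memory cache with a minimal get/set interface."""

    def __init__(self):
        self._data = {}

    def get(self, key: str) -> Any:
        return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value


class BrokenCache:
    """Simulates a cache server that is down."""

    def get(self, key: str) -> Any:
        raise ConnectionError("cache unavailable")

    def set(self, key: str, value: Any) -> None:
        raise ConnectionError("cache unavailable")


def read_through(cache: Any, key: str, load: Callable[[], Any]) -> Any:
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
    except Exception:
        pass  # cache is down: fall through and treat it as a miss
    value = load()  # the source of truth always works on its own
    try:
        cache.set(key, value)
    except Exception:
        pass  # failing to store must never fail the request
    return value
```

With this shape, an empty or unreachable cache makes requests slower, but never makes them fail.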
Never implement a cache without measurements. Do benchmarks and sanity checks on the performance of services, and be sure that your numbers are real. Too often, I’ve seen situations where a “slow” service was really due to Wi-Fi problems, overloaded development servers, or poorly indexed databases. It’s even the case that production systems can magically get faster when caches are removed or disabled. Avoid premature optimization. Code that depends on a cache hides performance problems, and these problems should always be visible. Develop without caches whenever possible.
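A minimal sketch of such a measurement, using only the Python standard library. The `measure` helper and the choice of 50 runs are assumptions; a real benchmark should run against production-like infrastructure.

```python
import statistics
import time
from typing import Callable, Dict


def measure(call: Callable[[], object], runs: int = 50) -> Dict[str, float]:
    """Time repeated calls and report median, 95th percentile, and worst case."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    samples.sort()
    # Simple nearest-rank style index for the 95th percentile.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "median": statistics.median(samples),
        "p95": samples[p95_index],
        "max": samples[-1],
    }
```

Looking at the p95 and the maximum, rather than only the average, shows whether the slowness is constant or only hits a few requests.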
Instead of caching - think about data management. What data do you need that’s not accessible quickly enough in real time? How can that data be gathered in a simple way, and kept up to date? Look for a different design, one that scales and doesn't add dependencies.
The best system design is one that can work in real time, using no cache: fast data access, decoupled business logic, and the ability to scale horizontally for more performance.