Welcome to the first part of our two-piece blog story on how we implemented a database-based caching facility relying on the Laravel abstractions. In this part:
- why caching in the first place, and why responses?
- existing libraries and why we did not use them
- main abstraction: the response generator closure
- making good use of it in the cache manager
- invalidating stuff on write
Caching is a vital aspect of web applications, primarily designed to boost performance. It helps eliminate the pain of slow responses and is particularly effective for websites that are mostly read-heavy. However, for interactive Software as a Service (SaaS) solutions like ours, the challenge lies in managing cache invalidation. Stale information must be avoided to ensure data accuracy.
When considering caching solutions, we explored two main options:
- Existing Libraries: We initially looked into libraries like Spatie's laravel-responsecache. While it's an excellent choice for caching GET requests, it didn't align with our specific requirements. We needed more control and flexibility over the caching mechanism, especially for non-GET requests.
  - this library is a very good solution for blanket caching on GETs, which is the typical use case for catalogues
  - it provides very good abstractions, like a hasher, but the implementation is really meant for GETs (e.g. it does not hash in the body contents)
  - introducing it only to use very little of it (like the base classes and the ETag headers) didn't feel like a good deal
- Homemade Caching: To meet our unique caching needs, we opted for a custom solution. Our homemade caching approach leverages Laravel's caching abstractions, uses the database as the cache store, and implements a tagging mechanism to manage cached data effectively.
So we went for homemade caching, and in this post we describe that system in detail.
Homemade caching has three main ingredients:
- rely as much as possible on the existing Cache abstractions by Laravel
- use our database as the cache store (see laravel.com/docs/9.x/cache and DatabaseStore.php), so that we don't add any complexity to our architecture (e.g. Redis)
- a tagging (keying, partitioning) mechanism: being able to reap and flush all keys in a group
We will show more of this later, when detailing our implementation.
Closures are a powerful and essential tool, and they play a crucial role in making your code more maintainable, reusable, and efficient. Closures are anonymous functions that can capture variables from their surrounding scope, allowing you to create self-contained units of code that can be passed around like regular values.
Closures play a very important role in our homemade caching feature. The feature sits behind a feature flag: if the flag is turned on we use the cache, otherwise we only generate the response and return it. One issue was that we have different endpoints, and each controller method has its own logic. To keep things consistent, our self::generateResponse method, which stores the cache entry under certain conditions, always accepts the same three parameters, so we can use the same method in all controllers:
| Key | Closure | Refresh (optional) |
|---|---|---|
| "$prefix-request-$requestHash" | callback function() | false |
Here is an example, where our main self::generateResponse method accepts the three parameters:
{
$refresh = false; // set to true to force a re-write of the cached response
$key = self::getCacheKeyForModel($model, $request);
$responseGenerator = function() use ($model) {
// core logic here
return response()->json(new ModelResource($model), 200);
};
return self::generateResponse(
$key, // key for cache
$responseGenerator, // logic to run
$refresh // re-write new response
);
}
Note that in theory, there isn't a substantial difference between placing the generation code within a closure as opposed to a method. Utilizing a closure offers the advantage of a cleaner abstraction, as it provides the caching manager with a more concise package, eliminating the need to introduce an array of methods at each endpoint we utilize.
However, it's essential to be mindful that including numerous arguments within the 'use()' statement effectively entails encapsulating a considerable amount of state. This results in a substantial closure object that accompanies the code. Such heavy objects can impact garbage collection efficiency and potentially lead to issues when attempting to serialize and transport these closures. It's worth noting that one should refrain from closing over dynamic content, such as referencing a class instance in your 'use' statement.
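To make the point about captured state concrete, here is a minimal sketch in plain PHP (no Laravel involved). The `$model` array and the `$settings` object are hypothetical stand-ins, not part of our code base. It shows that a value listed in `use` is copied when the closure is defined, while an object is captured as a handle, so later changes to it leak into the closure.

```php
<?php
// Hypothetical stand-in for a model, as a plain array (a value type in PHP).
$model = ['id' => 42, 'name' => 'Example'];

// The array is copied into the closure at definition time.
$responseGenerator = function () use ($model): string {
    return json_encode($model);
};

// Changing the outer variable afterwards does not affect the captured copy.
$model['name'] = 'Changed';
$body = $responseGenerator();

// Hypothetical object: the closure captures the handle, not a copy.
$settings = new stdClass();
$settings->label = 'before';

$labelReader = function () use ($settings): string {
    return $settings->label;
};

// Mutating the object after the closure was defined is visible inside it.
$settings->label = 'after';
$label = $labelReader();

echo $body, PHP_EOL;
echo $label, PHP_EOL;
```

The second half is the reason for the warning about closing over class instances: the closure's result then depends on whatever state the object has when the closure eventually runs.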
In this step we go through the core trait of this feature, ManageWebCache, and see how each method works and why we took certain approaches. The code is written in a modular way, so that each function contains only its related logic and can be reused elsewhere.
The main flow is this: when a user hits an endpoint, the controller method (e.g. ModelController@get) calls a private lookup method, which in turn calls the ManageWebCache trait method self::generateResponse; that way the controller also stays very clean. ManageWebCache@generateResponse then checks whether the feature is enabled, generates the response, and sets the cache.
The cache is configured in our config/cache.php file. We use a database-driven approach, so 'database' is the cache driver, and the cache_responses table holds our cached responses. To stay consistent with the rest of the application and prevent conflicts, the connection is set to our default database connection.
use Illuminate\Support\Str;

return [
    'stores' => [
        'responses' => [
            'driver' => 'database',
            'table' => 'cache_responses',
            'connection' => config('database.default'),
        ],
    ],
    // feature flag for response caching, off by default
    'cache_enabled' => env('RESPONSE_CACHE_ENABLED', false),
    'prefix' => env('CACHE_PREFIX', Str::slug(env('APP_NAME', 'laravel'), '_').'_cache'),
];
Moreover, we take particular care with the cache prefix, which allows us to distinguish our cache items.
The trait has the following private static properties, which hold the keys pointing into our config/cache.php file:
trait ManageWebCache {
private static $driverConfigKey = 'cache.stores.responses.driver';
private static $storeNameConfigKey = 'cache.stores.responses.table';
private static $cachePrefixConfigKey = 'cache.prefix';
// methods below
}
An incoming request is always associated to a platform user - in particular, it can always be associated to the main resource in our domain - but it does not have to necessarily come from that specific authenticated user: the request may originate from another user in the same team, or from the system itself.
To generate the key for a model, the trait has a method, ManageWebCache@getCacheKeyForModel, which builds a key for each incoming request. The key is a string combining the companyId and a hash of the request payload:
protected static function getCacheKeyForModel(Model $model, Request $request): string
{
$prefix = self::getCacheKeyPrefixForModel($model);
$requestHash = self::getHashFor($request);
return "$prefix-request-$requestHash"; // cache key
}
To generate the key prefix we are lucky: in our business domain a model always belongs to a company entity, so indexing by companyId is not only handy for prefix-based invalidation, but also makes the caching scope private to the company itself. The same concept applies to users: if we would like to clear all models for a specific user, we can do so via the getCacheKeyPrefixForUser() method.
private static function getCacheKeyPrefixForModel(Model $model): string
{
$companyId = $model->company_id;
return "@company:" . $companyId . '@';
}
private static function getCacheKeyPrefixForUser(string $userId): string
{
$companyId = self::getCompanyId($userId);
return "@company:" . $companyId . '@';
}
private static function getCompanyId(string $userId): string
{
$user = User::find($userId);
$companyId = $user ? ($user->getCompanyIdAttribute() ?? 'none') : 'none';
return $companyId;
}
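As a rough illustration of why the prefix is useful for grouping, the sketch below builds a few keys in plain PHP and selects those of one company by prefix. The `companyPrefix` helper and the key list are hypothetical and only mirror the format shown above; how keys are actually enumerated and flushed in the database store is not covered here.

```php
<?php
// Hypothetical helper mirroring the prefix format used by the trait.
function companyPrefix(string $companyId): string
{
    return '@company:' . $companyId . '@';
}

// Hypothetical list of keys currently held by the cache store.
$keys = [
    companyPrefix('7') . '-request-aaa',
    companyPrefix('7') . '-request-bbb',
    companyPrefix('71') . '-request-ccc',
];

// Select every key belonging to company 7. The trailing '@' in the prefix
// keeps company 71 from matching the prefix of company 7.
$prefix = companyPrefix('7');
$company7Keys = array_values(array_filter(
    $keys,
    function (string $key) use ($prefix): bool {
        return strncmp($key, $prefix, strlen($prefix)) === 0;
    }
));

print_r($company7Keys);
```

The closing delimiter matters: without it, a plain "starts with" match on company 7 would also pick up the entries of companies 70, 71 and so on.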
As you may have noticed, we log every critical step, so having a readable key is important: it lets us identify which route and method a key belongs to. For this reason the request hash contains the URL and the method in clear text, while an md5 digest, which makes up its second half, keeps each key unique.
We hash the body content and the parameters too, because we are working with POST requests as well.
The reader might argue that POST requests are not to be cached because, in classic REST terms, they create resources in the application. The thing is that we use this verb to drive expensive calculations too, where the parameters are posted in the request body; therefore, the request path and query parameters are not enough (again, one of the reasons why we didn't use the Spatie library).
private static function getHashFor(Request $request): string
{
$requestUrl = self::getNormalizedRequestUrl($request);
$method = $request->getMethod();
$content = $request->getContent();
$parametersString = serialize($content);
Log::debug("
=== Incoming cacheable request === \n
Request URL: $requestUrl \n
Method: $method \n
Content: $content \n
====== \n
");
return "responsecache-requesturl:$requestUrl-method:$method-" . md5("{$requestUrl}-{$method}-{$parametersString}");
}
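The effect of hashing the body can be checked with a small standalone sketch. The `hashFor` function below is a simplified stand-in for getHashFor that takes plain strings instead of a Request object; the URL and the JSON bodies are made-up examples.

```php
<?php
// Simplified stand-in for getHashFor(): plain strings instead of a Request.
function hashFor(string $url, string $method, string $content): string
{
    $parametersString = serialize($content);

    // Readable part first, md5 digest of URL, method and body last.
    return "responsecache-requesturl:$url-method:$method-"
        . md5("{$url}-{$method}-{$parametersString}");
}

// Same endpoint and verb, different request bodies (hypothetical payloads).
$a = hashFor('/some/path/model/42', 'POST', '{"years":5}');
$b = hashFor('/some/path/model/42', 'POST', '{"years":10}');
// Same body as $a again.
$c = hashFor('/some/path/model/42', 'POST', '{"years":5}');

echo $a, PHP_EOL;
echo $b, PHP_EOL;
```

Two POSTs to the same path with different bodies get different keys, while repeating the same request maps to the same key, which is what makes caching these calculations possible.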
The getNormalizedRequestUrl method is simple: it returns the base URL and path of the request and, if there is a query string (typically for GET requests), appends it after a '?'. The URL should contain an id reference to the resource, e.g. /some/path/model/42 for model 42.
private static function getNormalizedRequestUrl(Request $request): string
{
if ($queryString = $request->getQueryString()) {
$queryString = '?' . $queryString;
}
return $request->getBaseUrl() . $request->getPathInfo() . $queryString;
}
Once we have the right cache key, the next step is to pass it to the generateResponse method, the core method that handles all the cache-related logic and that every controller uses in order to implement caching.
It checks whether a response is available in the cache and, if not, generates the response, compresses it, and stores it in the cache. It also handles exceptions and logs messages to track cache hits, misses, and errors. The final response is returned to the caller.
protected static function generateResponse(string $key, callable $generator, bool $refresh = false)
{
$response = null;
try {
if (config('cache.cache_enabled')) { // check if feature is enabled - default false from config file
if (!$refresh && self::getCache()->has($key)) { // return from cache
Log::info("Responses cache hit for response with key '$key'");
$responseCompressed = self::getCache()->get($key);
$responseUncompressed = gzuncompress($responseCompressed);
$response = response()->json(json_decode($responseUncompressed));
} elseif ($refresh) {
Log::info("Refreshing entry for response with key '$key'");
} else {
Log::info("Responses cache miss for response with key '$key'");
}
if (!$response) { // re-generate and set cache
$response = $generator();
$responseString = $response->getContent();
$responseCompressed = gzcompress($responseString);
if (!self::getCache()->put($key, $responseCompressed, new \DateTime('tomorrow 11:59 PM'))) {
Log::error("Cache refuses to cache key with value $key");
}
}
} else {
$response = $generator();
Log::info("Responses caching disabled");
}
} catch (\Throwable $t) {
$message = $t->getMessage();
Log::error("Error while hitting the cache: $message");
$response = $generator();
}
return $response;
}
- Method signature: protected static function generateResponse(string $key, callable $generator, bool $refresh = false). This method takes three parameters:
  - $key (string): A unique identifier for the cached response.
  - $generator (callable): A function that generates the response if it's not found in the cache.
  - $refresh (bool, optional): A flag to force a refresh of the cache.
- Caching logic: The code checks whether caching is enabled based on the value of config('cache.cache_enabled'). If caching is enabled, it proceeds to check whether the response is already in the cache (self::getCache()->has($key)).
  - If the response is found in the cache and $refresh is not true, it retrieves the cached response, decompresses it, and returns it.
  - If $refresh is true, it logs that the entry is being refreshed.
  - If the response is not in the cache, it logs a cache miss.
- Generating and caching: If caching is enabled and no cached response was returned (cache miss or forced refresh), it generates the response by calling the $generator function. It then compresses the response using gzcompress and stores it in the cache under $key. The cache entry is set to expire at a specific time in the future (tomorrow at 11:59 PM). If caching is disabled, the response is generated but not stored, and a message is logged.
- Exception handling: The code includes exception handling. If an error occurs during the caching or response generation process, it catches the exception, logs an error message, and generates a response using the $generator function.
- Return value: The method returns the response. If caching is enabled and a cached response is available, it returns the cached response. Otherwise, it returns the newly generated response.
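The hit, miss and refresh paths described above can be tried out without Laravel. The following is only a sketch under stated assumptions: a plain array stands in for the cache repository, the `cachedResponse` function and the `$calls` counter are hypothetical, expiry, logging and exception handling are left out, and the zlib extension is assumed to be available for gzcompress and gzuncompress.

```php
<?php
// Hypothetical in-memory store standing in for the Laravel cache repository.
$store = [];

function cachedResponse(array &$store, string $key, callable $generator, bool $refresh = false): string
{
    if (!$refresh && isset($store[$key])) {
        // Cache hit: decompress the stored payload and return it.
        return gzuncompress($store[$key]);
    }

    // Cache miss or forced refresh: generate, compress, store.
    $body = $generator();
    $store[$key] = gzcompress($body);

    return $body;
}

// Count how often the (expensive) generator actually runs.
$calls = 0;
$generator = function () use (&$calls): string {
    $calls++;
    return json_encode(['value' => 42]);
};

$first = cachedResponse($store, 'k', $generator);        // miss: generator runs
$second = cachedResponse($store, 'k', $generator);       // hit: served from the store
$third = cachedResponse($store, 'k', $generator, true);  // refresh: generator runs again

echo $second, PHP_EOL;
echo "generator calls: $calls", PHP_EOL;
```

The generator runs only on the miss and on the forced refresh; the second call is served from the compressed entry.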
In the second part of the blog, we will focus on the other half of the cache management: when to clear entries, or invalidate the cache altogether? See you then!
Blog by Riccardo Vincelli and Usama Liaquat brought to you by the engineering team at Sharesquare.