@rvvincelli
Created June 7, 2024 15:36
ssq_blog_9.1.md

Cache Implementation with Laravel

Image source: <a href="https://www.freepik.com/free-vector/analytics-data-science-database-analysis-statistical-report-information-processing-automation-datacenter-expert-making-report-vector-isolated-concept-metaphor-illustration_12083365.htm#page=5&query=browser%20cache&position=22&from_view=keyword&track=ais&uuid=7141ccf8-6e8f-4f23-b85c-bbbc087ec571">freepik</a>, posted by <a href="https://www.freepik.com/author/vectorjuice">vectorjuice</a>

Welcome to the first part of our two-piece blog story on how we implemented a database-based caching facility relying on the Laravel abstractions. In this part we cover:

  • why caching in the first place, and why responses?
  • existing libraries and why we did not use them
  • main abstraction: the response generator closure
  • making good use of it in the cache manager
  • invalidating stuff on write

Why Caching

Image source: <a href="https://www.freepik.com/free-vector/fast-loading-concept-illustration_6184229.htm#query=fast%20browser&position=2&from_view=search&track=ais&uuid=94d7e9e5-ef4a-47da-9e32-ade4ec4ca763">Image by storyset</a> on Freepik

Caching is a vital aspect of web applications, primarily designed to boost performance. It helps eliminate the pain of slow responses and is particularly effective for websites that are mostly read-heavy. However, for interactive Software as a Service (SaaS) solutions like ours, the challenge lies in managing cache invalidation. Stale information must be avoided to ensure data accuracy.

Existing Libraries: Why We Chose Homemade Caching

When considering caching solutions, we explored two main options:

  1. Existing Libraries: We initially looked into libraries like Spatie's laravel-responsecache. While it's an excellent choice for caching GET requests, it didn't align with our specific requirements. We needed more control and flexibility over the caching mechanism, especially for non-GET requests.
    • this library is a very good solution for blanket caching on GETs, which is the typical use case for catalogues
    • it provides very good abstractions, like a hasher, but the implementation is really meant for GETs (e.g. it doesn't hash in body contents)
    • introducing it only to use very little of it (like the base classes and the ETag headers) didn't feel like a good deal
  2. Homemade Caching: To meet our unique caching needs, we opted for a custom solution. Our homemade caching approach leverages Laravel's caching abstractions, uses the database as the cache store, and implements a tagging mechanism to manage cached data effectively.

So we went with homemade caching, and in this post we give a detailed look at how it works.

Homemade Caching

Homemade caching has three main ingredients:

  • rely as much as possible on the existing Cache abstractions by Laravel
  • use our database as the cache store (see laravel.com/docs/9.x/cache and DatabaseStore.php), so that we don't add any complexity to our architecture (e.g. Redis)
  • a tagging (keying, partitioning) mechanism: be able to reap and flush all keys in a group

We will show more of this later, when detailing our implementation.

Closure

Closures are powerful and essential: they play a crucial role in making your code more maintainable, reusable, and efficient. Closures are anonymous functions that can capture variables from their surrounding scope, allowing you to create self-contained units of code that can be passed around like regular values.
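To make the capturing behaviour concrete, here is a minimal sketch in plain PHP. All names in it ($greeting, $byValue, runTwice) are made up for illustration and are not part of our codebase.

```php
<?php
// Illustrative only: every name below is hypothetical.
$greeting = 'hello';

// `use ($greeting)` copies the current value into the closure.
$byValue = function (string $name) use ($greeting): string {
    return "$greeting, $name";
};

// Changing the outer variable afterwards does not affect the captured copy.
$greeting = 'goodbye';

// A closure is a regular value: it can be handed to another function.
function runTwice(callable $fn, string $arg): array
{
    return [$fn($arg), $fn($arg)];
}

$result = runTwice($byValue, 'cache');
```

Both calls see the value captured when the closure was created, which is exactly what lets us package the response logic once and hand it to the cache manager.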

Closures play a very important role in our homemade caching feature. The feature sits behind a feature flag: if the flag is turned on we use the cache, otherwise we only generate the response and return it. One issue was that we have many endpoints, and each controller method has its own logic. To keep things consistent, our self::generateResponse method, which stores entries in the cache under certain conditions, always accepts the same three parameters, so that we can use the same method in all controllers.

| Key | Closure | Refresh (optional) |
| --- | --- | --- |
| "$prefix-request-$requestHash" | callback function() | false |

Here is an example, where our main self::generateResponse method accepts three parameters:

{
  $refresh = false; // set to true to force re-generating the cached response
  $key = self::getCacheKeyForModel($model, $request);
  $responseGenerator = function() use ($model) {
    // core logic here
    return response()->json(new ModelResource($model), 200);
  };
  return self::generateResponse(
    $key, // key for cache
    $responseGenerator, // logic to run
    $refresh // re-write new response
  );
}

Note that in theory, there isn't a substantial difference between placing the generation code within a closure as opposed to a method. Utilizing a closure offers the advantage of a cleaner abstraction, as it provides the caching manager with a more concise package, eliminating the need to introduce an array of methods at each endpoint we utilize.

However, it's essential to be mindful that including numerous arguments within the 'use()' statement effectively entails encapsulating a considerable amount of state. This results in a substantial closure object that accompanies the code. Such heavy objects can impact garbage collection efficiency and potentially lead to issues when attempting to serialize and transport these closures. It's worth noting that one should refrain from closing over dynamic content, such as referencing a class instance in your 'use' statement.

Core implementation

In this step we go through the core trait for this feature, ManageWebCache, and see how each method works and why we took certain approaches. The code is written in a modular way: each function contains only its own logic and can be reused elsewhere.

The main flow is as follows: when a user hits an endpoint, the controller method (e.g. ModelController@get) calls a private lookup method, which in turn calls the ManageWebCache trait method self::generateResponse; this way the controller stays very clean. ManageWebCache@generateResponse then checks whether the feature is enabled, generates the response and sets the cache.
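Before diving into the real trait, the flow above can be sketched without Laravel at all. This is a simplified stand-in under stated assumptions: the SketchCache class is hypothetical, an in-memory array plays the role of the database store, and responses are plain strings.

```php
<?php
// Hypothetical stand-in for the real trait: an array plays the cache store.
final class SketchCache
{
    /** @var array<string, string> */
    private array $store = [];

    public function __construct(private bool $enabled)
    {
    }

    public function generateResponse(string $key, callable $generator, bool $refresh = false): string
    {
        if (!$this->enabled) {
            return $generator(); // feature flag off: always compute
        }
        if (!$refresh && array_key_exists($key, $this->store)) {
            return $this->store[$key]; // cache hit
        }
        $response = $generator(); // cache miss or forced refresh
        $this->store[$key] = $response;
        return $response;
    }
}

// Count how many times the "expensive" generator actually runs.
$calls = 0;
$generator = function () use (&$calls): string {
    $calls++;
    return '{"id":42}';
};

$cache = new SketchCache(true);
$first = $cache->generateResponse('@company:7@-request-abc', $generator);  // miss
$second = $cache->generateResponse('@company:7@-request-abc', $generator); // hit
$callsWhenEnabled = $calls;

$disabled = new SketchCache(false);
$disabled->generateResponse('@company:7@-request-abc', $generator);
$disabled->generateResponse('@company:7@-request-abc', $generator);
$callsTotal = $calls;
```

With the flag on, the generator runs once for two identical requests; with the flag off, it runs every time.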

Step 1 - Cache Configuration

The cache is configured in our config/cache.php file. We take a database-driven approach, so 'database' is the cache driver, and the cache_responses table stores our cached responses. To stay consistent with the rest of the application and prevent conflicts, the connection is set to our default database connection.

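use Illuminate\Support\Str;

return [
  'stores' => [
    // dedicated store for cached responses, backed by our own database
    'responses' => [
      'driver' => 'database',
      'table' => 'cache_responses',
      'connection' => config('database.default'),
    ],
  ],

  // feature flag for response caching, off by default
  'cache_enabled' => env('RESPONSE_CACHE_ENABLED', false),

  'prefix' => env('CACHE_PREFIX', Str::slug(env('APP_NAME', 'laravel'), '_').'_cache'),
];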

We also set a cache prefix, which keeps our cache items distinguishable from those of other applications sharing the same store.

Step 2 - Define Trait ManageWebCache

The trait has the following private static properties, which hold the keys of the relevant entries in our config/cache.php file:

trait ManageWebCache {
  private static $driverConfigKey = 'cache.stores.responses.driver';

  private static $storeNameConfigKey = 'cache.stores.responses.table';

  private static $cachePrefixConfigKey = 'cache.prefix';

  // methods below
}

Step 3 - Generate key

An incoming request is always associated to a platform user - in particular, it can always be associated to the main resource in our domain - but it does not have to necessarily come from that specific authenticated user: the request may originate from another user in the same team, or from the system itself.

To generate the key for a model we have a trait method, ManageWebCache@getCacheKeyForModel, which builds a key for each incoming request; the key is a string combining the companyId and a hash of the request payload.

protected static function getCacheKeyForModel(Model $model, Request $request): string
{
  $prefix = self::getCacheKeyPrefixForModel($model);
  $requestHash = self::getHashFor($request);
  return "$prefix-request-$requestHash"; // cache key
}

As for the key prefix: luckily, in our business domain a model always belongs to a company entity, so indexing by companyId is not only handy for prefix-based invalidation, but also makes the caching scope private to the company itself. The same concept applies to users: if we would like to clear all models for a specific user, we can do so via the getCacheKeyPrefixForUser() method below.

private static function getCacheKeyPrefixForModel(Model $model): string
{
  $companyId = $model->company_id;
  return "@company:" . $companyId . '@';
}

private static function getCacheKeyPrefixForUser(string $userId): string
{
  $companyId = self::getCompanyId($userId);
  return "@company:" . $companyId . '@';
}

private static function getCompanyId(string $userId): string
{
  $user = User::find($userId);
  $companyId = $user ? ($user->getCompanyIdAttribute() ?? 'none') : 'none';
  return $companyId;
}
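To illustrate why this prefix shape is handy for invalidation, here is a small framework-free sketch. The flushByPrefix helper and the sample keys are hypothetical; how the real implementation removes rows from the cache table is not shown in this part.

```php
<?php
// Hypothetical helper: drop every entry whose key starts with the given prefix.
function flushByPrefix(array $store, string $prefix): array
{
    return array_filter(
        $store,
        fn (string $key): bool => !str_starts_with($key, $prefix),
        ARRAY_FILTER_USE_KEY
    );
}

// Sample keys following the "$prefix-request-$requestHash" shape.
$store = [
    '@company:7@-request-aaa' => 'cached A',
    '@company:7@-request-bbb' => 'cached B',
    '@company:70@-request-ccc' => 'cached C',
    '@company:8@-request-ddd' => 'cached D',
];

// Flush everything cached for company 7.
$remaining = flushByPrefix($store, '@company:7@');
```

Note the closing '@' in the prefix: without it, flushing company 7 would also match the keys of company 70.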

As you may have noticed, we log every critical step, so a readable key matters: it lets us identify which route and method a key belongs to. For this reason the key contains the request URL and method in clear text, and to keep each key unique, its second half is an md5 hash of the request.

We hash the body content and parameters too, because we are working with POST requests as well.

The reader might argue that POST requests are not to be cached because, in classic REST terms, they create resources in the application. The thing is that we use this verb to drive expensive calculations too, where the parameters are posted in the request body; therefore, the request path and query parameters are not enough (again, one of the reasons why we didn't use the Spatie library).

private static function getHashFor(Request $request): string
{
  $requestUrl = self::getNormalizedRequestUrl($request);
  $method = $request->getMethod();
  $content = $request->getContent();
  $parametersString = serialize($content);
  Log::debug("
    === Incoming cacheable request === \n
    Request URL: $requestUrl \n
    Method: $method \n
    Content: $content \n
    ====== \n
  ");
  return "responsecache-requesturl:$requestUrl-method:$method-" . md5("{$requestUrl}-{$method}-{$parametersString}");
}
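The effect of hashing the body can be checked with a framework-free sketch of the same step. The sketchHash function and the sample payloads are hypothetical; it takes plain strings instead of a Request object.

```php
<?php
// Hypothetical, framework-free version of the hashing step above.
function sketchHash(string $url, string $method, string $body): string
{
    $parametersString = serialize($body);
    // Readable part first, md5 of URL, method and body last.
    return "responsecache-requesturl:$url-method:$method-"
        . md5("{$url}-{$method}-{$parametersString}");
}

// Same endpoint and verb, different posted parameters.
$a = sketchHash('/some/path/model/42', 'POST', '{"rate":0.1}');
$b = sketchHash('/some/path/model/42', 'POST', '{"rate":0.2}');
$c = sketchHash('/some/path/model/42', 'POST', '{"rate":0.1}');
```

Identical requests map to the same key, while a different body yields a different key even though URL and method are the same. Since the raw body is hashed, two payloads that are equivalent as JSON but differ in whitespace or key order would get different keys.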

The getNormalizedRequestUrl method is simple: it returns the URL-related part of the request. If there is a query string, which is typical of GET requests, it is appended after a '?'. The URL should contain an id reference to the resource, e.g. /some/path/model/42 for model 42.

private static function getNormalizedRequestUrl(Request $request): string
{
  if ($queryString = $request->getQueryString()) {
    $queryString = '?' . $queryString;
  }
  return $request->getBaseUrl() . $request->getPathInfo() . $queryString;
}

Step 4 - Generate Response & Set Cache

Image source: <a href="https://miro.medium.com/v2/resize:fit:700/0*LmFnEFVoQql3guzA.png">medium</a>

Once we have the correct cache key, the next step is to pass it to the generateResponse method. This is the core method handling all the cache-related logic, and every controller uses it to implement caching.

The method checks whether a response is available in the cache and, if not, generates the response, compresses it, and stores it in the cache. It also handles exceptions and logs messages to track cache hits, misses, and errors. The final response is returned to the caller.

protected static function generateResponse(string $key, callable $generator, bool $refresh = false)
{
  $response = null;
  try {
    if (config('cache.cache_enabled')) { // check if feature is enabled - default false from config file

      if (!$refresh && self::getCache()->has($key)) { // return from cache
        Log::info("Responses cache hit for response with key '$key'");
        $responseCompressed = self::getCache()->get($key);
        $responseUncompressed = gzuncompress($responseCompressed);
        $response = response()->json(json_decode($responseUncompressed));
      } elseif ($refresh) {
        Log::info("Refreshing entry for response with key '$key'");
      } else {
        Log::info("Responses cache miss for response with key '$key'");
      }

      if (!$response) { // re-generate and set cache
        $response = $generator();
        $responseString = $response->getContent();
        $responseCompressed = gzcompress($responseString);
        if (!self::getCache()->put($key, $responseCompressed, new \DateTime('tomorrow 11:59 PM'))) {
          Log::error("Cache refuses to cache key with value $key");
        }
      }
    } else {
      $response = $generator();
      Log::info("Responses caching disabled");
    }

  } catch (\Throwable $t) {
    $message = $t->getMessage();
    Log::error("Error while hitting the cache: $message");
    $response = $generator();
  }
  return $response;
}

  1. Method Signature:

    • protected static function generateResponse(string $key, callable $generator, bool $refresh = false)

    This method takes three parameters:

    • $key (string): A unique identifier for the cached response.
    • $generator (callable): A function that generates the response if it's not found in the cache.
    • $refresh (bool, optional): A flag to force a refresh of the cache.

  2. Caching Logic: The code checks if caching is enabled based on the value of config('cache.cache_enabled'). If caching is enabled, it proceeds to check if the response is already in the cache (self::getCache()->has($key)).

    • If the response is found in the cache and $refresh is not true, it retrieves the cached response, decompresses it, and returns it wrapped in a new JSON response.
    • If $refresh is true, it logs that the entry is being refreshed.
    • If the response is not in the cache, it logs a cache miss.
    • If caching is disabled, it logs that and simply generates the response, without touching the cache.

  3. Generating and Caching: On a cache miss or a forced refresh, it generates the response by calling the $generator function. It then compresses the response body using gzcompress and stores it in the cache under $key. The cache entry is set to expire at 11:59 PM of the following day.

  4. Exception Handling: The code includes exception handling. If an error occurs during the caching or response generation process, it catches the exception, logs an error message, and falls back to generating the response with the $generator function.

  5. Return Value: The method returns the generated response. If caching is enabled and a cached response is available, it returns the cached response. Otherwise, it returns the newly generated response.
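Two details of the steps above are easy to verify in isolation: the compression round trip, and what the 'tomorrow 11:59 PM' expiry actually resolves to. The sketch below is plain PHP with a sample payload and a fixed reference time chosen only to keep the result deterministic; it assumes the zlib extension is available.

```php
<?php
// What gets stored: the JSON body, compressed with zlib.
$body = json_encode(['id' => 42, 'name' => 'example']);
$compressed = gzcompress($body);
$restored = gzuncompress($compressed);

// How long it lives: "tomorrow 11:59 PM" resolves to 23:59 on the next
// calendar day. A fixed reference time stands in for "now" here.
$now = new DateTimeImmutable('2024-06-07 15:36:00');
$expiresAt = $now->modify('tomorrow 11:59 PM');

// Time to live in hours, measured from the reference time.
$ttlHours = ($expiresAt->getTimestamp() - $now->getTimestamp()) / 3600;
```

Because the expiry is pinned to the end of the next day rather than to a fixed duration, an entry lives somewhere between roughly 24 and 48 hours, depending on the time of day it was written.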

In the second part of the blog, we will focus on the other half of the cache management: when to clear entries, or invalidate the cache altogether? See you then!

Blog by Riccardo Vincelli and Usama Liaquat brought to you by the engineering team at Sharesquare.
