@AlanCoding
Last active May 10, 2022 17:54

Notes on AWX settings cache

Issue: ansible/awx#5765

The current work changes things so that our pre-existing TTL cache is used in all services, not just the callback receiver.

Settings wrapper intro

Our configure-tower-in-tower settings modify the normal Django settings. If you inspect settings, you will find that it is the normal Django LazySettings type.

Things get weird when you realize that we overrode the __getattr__ of those settings. Thus, we store our own settings-like object in settings._wrapped, which is a SettingsWrapper.
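To make the wrapping a bit more concrete, here is a minimal sketch of the general pattern: a lazy proxy whose _wrapped object is swapped for a settings-like wrapper that overrides __getattr__. This is not the actual AWX or Django code. The class bodies, the overrides dictionary, and the setting values are all assumptions made for illustration.

```python
class BaseSettings:
    """Stands in for the file-based settings (hypothetical values)."""
    DEBUG = False
    MANAGE_ORGANIZATION_AUTH = True


class SettingsWrapper:
    """Settings-like object that checks an override store before the defaults."""

    def __init__(self, default_settings, overrides):
        self._default_settings = default_settings
        self._overrides = overrides  # stands in for database-backed settings

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so the two
        # attributes assigned in __init__ never reach this method.
        if name in self._overrides:
            return self._overrides[name]
        return getattr(self._default_settings, name)


class LazySettings:
    """Simplified stand-in for a lazy settings proxy."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Delegate every unknown attribute to whatever _wrapped currently is.
        return getattr(self._wrapped, name)


settings = LazySettings(BaseSettings())
# Swap in the wrapper, keeping the original object as the fallback.
settings._wrapped = SettingsWrapper(
    settings._wrapped, {"MANAGE_ORGANIZATION_AUTH": False}
)
```

With this in place, type(settings) is still LazySettings, while attribute reads such as settings.MANAGE_ORGANIZATION_AUTH go through the wrapper first and fall back to the defaults otherwise.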

Settings caches

So how do I get the cache? You can import it.

from django.core.cache import cache as django_cache

This will give you the Redis cache, and you can call methods on it such as .clear(), .get(), and .set().

This is the cache that our settings stuff uses. You can verify this by evaluating settings._wrapped.cache.cache is django_cache, which returns True. Let's unpack that a bit.

The object settings._wrapped.cache is the cache that our main settings class references. This isn't the real settings cache itself. It is another layer on top of it that handles encrypted settings. In practice, there are relatively few encrypted settings, but those that we have are a little messy, because encryption is seeded with the setting's primary key value, which also has to be cached (in Redis). So for now, you may ignore this layer.

We also have an in-memory cache, which can be found at settings._awx_conf_memoizedcache. This does not use the same syntax as the Django / Redis cache. Instead, it is dictionary-like. So you can do settings._awx_conf_memoizedcache['foo'] = 'bar', although in practice we use more complex keys.

Key differences between caches

Consider the Redis cache at settings._wrapped.cache.cache

  • timeout for the key-value store is typically 60 seconds
  • this is shared among all processes on the same node
  • avoids spamming the database

Consider the in-memory cache settings._awx_conf_memoizedcache

  • timeout is 5 seconds (see changes from this PR)
  • only available for a given process
  • has a threading lock to avoid race conditions, as of a recent fix
  • avoids egregiously duplicated hits to the Redis cache when the same setting is read repeatedly, for example inside loops
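How the two layers interact can be sketched roughly as below. This is only an illustration under stated assumptions, not the AWX implementation: TTLDict, FakeClock, and get_setting are hypothetical names, the "shared" cache is a plain in-process stand-in for Redis, and the database is a dictionary. The 5 and 60 second timeouts are the ones listed above.

```python
import threading
import time


class TTLDict:
    """Tiny dictionary-like cache whose entries expire after ttl seconds."""

    def __init__(self, ttl, timer=time.monotonic):
        self.ttl = ttl
        self.timer = timer
        self._data = {}  # key -> (expires_at, value)
        self._lock = threading.Lock()  # guards _data across threads

    def __setitem__(self, key, value):
        with self._lock:
            self._data[key] = (self.timer() + self.ttl, value)

    def __getitem__(self, key):
        with self._lock:
            expires_at, value = self._data[key]  # KeyError if never set
            if self.timer() >= expires_at:
                del self._data[key]
                raise KeyError(key)
            return value


class FakeClock:
    """Manually advanced clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


def get_setting(name, memory_cache, shared_cache, database, stats):
    """Look in the per-process cache, then the shared cache, then the database."""
    try:
        return memory_cache[name]
    except KeyError:
        pass
    stats["shared_cache_calls"] += 1
    try:
        value = shared_cache[name]
    except KeyError:
        stats["database_calls"] += 1
        value = database[name]
        shared_cache[name] = value
    memory_cache[name] = value
    return value


clock = FakeClock()
memory_cache = TTLDict(ttl=5, timer=clock)   # per-process layer
shared_cache = TTLDict(ttl=60, timer=clock)  # stand-in for the Redis layer
database = {"MANAGE_ORGANIZATION_AUTH": True}
stats = {"shared_cache_calls": 0, "database_calls": 0}

history = []
for now in (0.0, 1.0, 6.0, 61.0):
    clock.now = now
    get_setting("MANAGE_ORGANIZATION_AUTH", memory_cache, shared_cache, database, stats)
    history.append((now, stats["shared_cache_calls"], stats["database_calls"]))
```

In this sketch the read at 1 second is served entirely from memory, the read at 6 seconds falls through to the shared cache only, and the read at 61 seconds goes all the way to the database because both layers have expired.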

An example of a loop that references the same setting multiple times is this:

https://github.com/ansible/awx/blob/ac6a82eee41feb041ff3e4d16459d4b1a774175f/awx/main/access.py#L423

            if not settings.MANAGE_ORGANIZATION_AUTH and isinstance(obj, (Team, User)):
                user_capabilities[display_method] = self.user.is_superuser
                continue

This can be hit multiple times for every object in a list view, which means MANAGE_ORGANIZATION_AUTH gets requested over and over within a single request.
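A rough way to see the effect is to count lookups in a loop shaped like the one above. The sketch below is illustrative only: CountingSettings, the stub model classes, the display method names, and the object counts are all made up, and the memoization has no TTL.

```python
class Team:
    pass


class User:
    pass


class Job:
    pass


class CountingSettings:
    """Settings-like object that counts how often the backing store is read."""

    def __init__(self, values, memoize):
        self._values = values
        self._memoize = memoize
        self._memo = {}
        self.backend_calls = 0

    def __getattr__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        if self._memoize and name in self._memo:
            return self._memo[name]
        self.backend_calls += 1  # stands in for one call to the shared cache
        try:
            value = self._values[name]
        except KeyError:
            raise AttributeError(name) from None
        if self._memoize:
            self._memo[name] = value
        return value


def count_backend_calls(objects, display_methods, memoize):
    settings = CountingSettings({"MANAGE_ORGANIZATION_AUTH": False}, memoize)
    for obj in objects:
        user_capabilities = {}
        for display_method in display_methods:
            if not settings.MANAGE_ORGANIZATION_AUTH and isinstance(obj, (Team, User)):
                user_capabilities[display_method] = True  # placeholder for self.user.is_superuser
                continue
            user_capabilities[display_method] = False  # placeholder for the real permission check
    return settings.backend_calls


objects = [Team() for _ in range(10)] + [Job() for _ in range(15)]
display_methods = ["edit", "delete", "copy"]

unmemoized = count_backend_calls(objects, display_methods, memoize=False)
memoized = count_backend_calls(objects, display_methods, memoize=True)
print(unmemoized, memoized)
```

The setting is read once per object per display method, so with 25 objects and 3 methods the unmemoized version makes 75 backend calls where the memoized one makes a single call.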

Key changes to in-memory cache

ansible/awx#12166

The change driving this is that we apply the 5 second TTL to all services, not just the callback receiver. This reduces the number of cache calls in ordinary dispatcher code and web requests. Some numbers:

Number of cache calls per request to /api/v2/unified_jobs/:

  • after change
    • 60
    • 10
    • 10
    • 342 preloads the cache
  • before change
    • 157
    • 107
    • 77
    • 78
    • 410 preloads the cache

Even in the worst case, these calls still tend to take not much more than 0.1 seconds cumulatively.

However, the change does let us get down to 10 calls per request in some cases, which is an order of magnitude reduction. That figure of 10 roughly reflects the number of meaningful configure-tower-in-tower settings we actually need to produce a response.

The other, larger numbers could be driven down to ~10 in my opinion, if we fixed other inefficiencies, particularly in the encryption logic.

Because we use the in-memory cache in uWSGI, we have to add more cases to clear the in-memory cache, which are mainly:

  • when a request starts, clear the cache to avoid inconsistencies (for example, changing a setting and then reloading the page within 2 seconds)
  • when a setting is created, deleted, or changed, we have to reload to avoid issues with the tower URL setting, which is referenced in field logic from dependent settings, and also to avoid inconsistencies
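The two clearing cases can be illustrated with a toy model of two web workers that share one store but each keep their own in-memory cache. Everything here is hypothetical (the Worker class, its method names, the setting name, and the URLs), and the TTL is left out to keep the sketch short. In a real Django service the clearing would presumably hang off request handling and setting save/delete hooks rather than explicit method calls.

```python
class Worker:
    """One web worker process with its own in-memory cache (hypothetical)."""

    def __init__(self, shared_store):
        self.shared_store = shared_store  # stands in for Redis / the database
        self.memory_cache = {}

    def read_setting(self, name):
        if name not in self.memory_cache:
            self.memory_cache[name] = self.shared_store[name]
        return self.memory_cache[name]

    def write_setting(self, name, value):
        self.shared_store[name] = value
        # Case 2: a setting changed, so drop everything this process memoized,
        # including values derived from other settings.
        self.memory_cache.clear()

    def handle_request(self, name, clear_on_start=True):
        if clear_on_start:
            # Case 1: start every request from an empty in-memory cache.
            self.memory_cache.clear()
        return self.read_setting(name)


shared_store = {"TOWER_URL_BASE": "https://old.example.com"}
worker_a = Worker(shared_store)
worker_b = Worker(shared_store)

worker_b.handle_request("TOWER_URL_BASE")  # worker_b memoizes the old value
worker_a.write_setting("TOWER_URL_BASE", "https://new.example.com")

stale = worker_b.handle_request("TOWER_URL_BASE", clear_on_start=False)
fresh = worker_b.handle_request("TOWER_URL_BASE")
```

Without the clear at request start, worker_b keeps answering from its own memory and returns the old URL even though worker_a already saved a new one. With the clear, the next request picks up the change.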