AlanCoding/cache_stuff.md

## cache_stuff.md

      
    Notes on AWX settings cache

Issue: ansible/awx#5765
The current work changes things so that our pre-existing TTL cache is used in all services, not just the callback receiver.
Settings wrapper intro

Our configure-tower-in-tower settings modify the normal Django settings. If you inspect settings, you will find that it is the normal Django LazySettings type.
Things get weird once you realize that we overrode the __getattr__ of those settings. We do this by storing our own settings-like object, a SettingsWrapper, in settings._wrapped.
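To make the layering concrete, here is a minimal sketch in plain Python that does not import Django or AWX. The class names LazySettingsSketch, SettingsWrapperSketch, and DefaultSettings are made up for illustration, and the "database overrides" dictionary is an assumption about how the wrapper might resolve values, not the actual AWX code.

```python
class DefaultSettings:
    """Stand-in for the file-based settings (values are hypothetical)."""
    DEBUG = False
    MANAGE_ORGANIZATION_AUTH = True


class SettingsWrapperSketch:
    """Hypothetical wrapper: stored overrides win, defaults are the fallback."""

    def __init__(self, default_settings, overrides):
        self._default_settings = default_settings
        self._overrides = overrides  # stands in for values saved through the API

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so the two
        # attributes set in __init__ never pass through here.
        if name in self._overrides:
            return self._overrides[name]
        return getattr(self._default_settings, name)


class LazySettingsSketch:
    """Hypothetical outer object: forwards unknown attributes to _wrapped."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        return getattr(self._wrapped, name)


settings = LazySettingsSketch(
    SettingsWrapperSketch(DefaultSettings(), {'MANAGE_ORGANIZATION_AUTH': False})
)
print(type(settings).__name__, type(settings._wrapped).__name__)
print(settings.MANAGE_ORGANIZATION_AUTH, settings.DEBUG)
```

The outer object keeps its usual type, which is why inspecting settings alone does not reveal the custom lookup logic.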
Settings caches

So how do I get the cache? You can import it.
from django.core.cache import cache as django_cache
This will give you the Redis cache, and you can call methods on it such as .clear(), .get(), and .set().
This is the cache that our settings stuff uses. You can verify this with settings._wrapped.cache.cache is django_cache, which evaluates to True. Let's unpack that a bit.
The object settings._wrapped.cache is the cache that our main settings class references. This isn't the real settings cache, because it is another layer on top to handle encrypted settings. In practice, there are relatively few encrypted settings, but those that we have are a little messy, because encryption is seeded with the setting primary key value, which also has to be cached (in Redis). So for now, you may ignore this layer.
We also have an in-memory cache, which can be found at settings._awx_conf_memoizedcache. It does not use the same syntax as the Django / Redis cache. Instead, it is dictionary-like, so you can do settings._awx_conf_memoizedcache['foo'] = 'bar', although in practice we use more complex keys.
Key differences between caches

Consider the Redis cache at settings._wrapped.cache.cache

timeout for the key-value store is typically 60 seconds
this is shared among all processes on the same node
avoids spamming the database

Consider the in-memory cache settings._awx_conf_memoizedcache

timeout is 5 seconds (see changes from this PR)
only available for a given process
has a threading lock to avoid race conditions, as of a recent fix
avoids egregiously duplicated hits to the Redis cache over loops and stuff
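The two layers described above can be sketched as follows. This is a simplified illustration and not the AWX implementation: TTLDict, get_setting, and the counters dictionary are hypothetical names, the shared cache is modeled with the same in-process class rather than Redis, and the clock is injectable only so expiry can be demonstrated without sleeping.

```python
import threading
import time


class TTLDict:
    """Dictionary-like cache whose entries expire after ttl seconds (illustrative)."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._data = {}
        self._lock = threading.Lock()  # guards _data against concurrent threads

    def __getitem__(self, key):
        with self._lock:
            value, expires_at = self._data[key]  # KeyError when the key is absent
            if self.clock() >= expires_at:
                del self._data[key]
                raise KeyError(key)
            return value

    def __setitem__(self, key, value):
        with self._lock:
            self._data[key] = (value, self.clock() + self.ttl)

    def clear(self):
        with self._lock:
            self._data.clear()


def get_setting(name, memo, shared, database, counters):
    """Check the per-process cache, then the shared cache, then the database."""
    try:
        return memo[name]
    except KeyError:
        pass
    counters['shared_calls'] += 1
    try:
        value = shared[name]
    except KeyError:
        counters['db_calls'] += 1
        value = database[name]
        shared[name] = value  # repopulate the shared layer
    memo[name] = value  # repopulate the per-process layer
    return value
```

With a 5 second TTL on memo and a 60 second TTL on shared, repeated lookups inside one request only touch the shared layer once, and the database is consulted only when both layers have expired.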

An example of a loop that references the same setting multiple times is this:
https://github.com/ansible/awx/blob/ac6a82eee41feb041ff3e4d16459d4b1a774175f/awx/main/access.py#L423
            if not settings.MANAGE_ORGANIZATION_AUTH and isinstance(obj, (Team, User)):
                user_capabilities[display_method] = self.user.is_superuser
                continue
This can be hit multiple times for every object in a list view, so the same MANAGE_ORGANIZATION_AUTH value gets requested over and over within a single request.
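A rough way to see the effect is to count lookups with and without a per-process memo. The sketch below is self-contained and hypothetical: CountingBackend stands in for the Redis-backed settings lookup, and the two functions only mimic the shape of the loop above, not the real access.py logic.

```python
class CountingBackend:
    """Stand-in for the shared settings cache; counts every lookup."""

    def __init__(self, values):
        self.values = values
        self.calls = 0

    def get(self, name):
        self.calls += 1
        return self.values[name]


def capabilities_uncached(backend, objects):
    """Ask the backend for the same setting once per object."""
    result = {}
    for obj in objects:
        result[obj] = bool(backend.get('MANAGE_ORGANIZATION_AUTH'))
    return result


def capabilities_memoized(backend, objects, memo):
    """Ask the backend only when the value is not already in the memo dict."""
    result = {}
    for obj in objects:
        if 'MANAGE_ORGANIZATION_AUTH' not in memo:
            memo['MANAGE_ORGANIZATION_AUTH'] = backend.get('MANAGE_ORGANIZATION_AUTH')
        result[obj] = bool(memo['MANAGE_ORGANIZATION_AUTH'])
    return result


objects = [f'team-{i}' for i in range(25)]

uncached_backend = CountingBackend({'MANAGE_ORGANIZATION_AUTH': True})
capabilities_uncached(uncached_backend, objects)

memoized_backend = CountingBackend({'MANAGE_ORGANIZATION_AUTH': True})
capabilities_memoized(memoized_backend, objects, memo={})

print(uncached_backend.calls, memoized_backend.calls)
```

For a page of 25 objects, the uncached version makes 25 backend lookups while the memoized version makes 1.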
Key changes to in-memory cache

ansible/awx#12166
The change driving this is that we apply the 5 second TTL to all services, not just the callback receiver.
This reduces the number of cache calls in ordinary dispatcher code and web requests. Some numbers:
Number of cache calls per request to /api/v2/unified_jobs/:

after change

60
10
10
342 (preloads the cache)


before change

157
107
77
78
410 (preloads the cache)


Even in the worst-case scenario, these calls still tend to take not much more than 0.1 seconds cumulatively.
However, the change does let us get down to 10 calls per request in some cases, which is an order of magnitude reduction.
The figure of 10 roughly reflects the number of meaningful configure-tower-in-tower settings we actually need in order to produce a response.
The other, larger numbers could be driven down to ~10 in my opinion, if we fixed other inefficiencies, particularly in the encryption logic.
Because the in-memory cache is now used in uWSGI, we have to clear it in more cases, mainly:

when a request starts, clear the cache to avoid inconsistencies (for example, changing a setting and then reloading the page within 2 seconds)
when a setting is created, deleted, or changed, we have to clear it, both to avoid inconsistencies and to avoid issues with the tower URL setting, which is referenced in field logic from dependent settings
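The two clearing rules can be illustrated with a toy model of two web workers sharing one database. Everything here is hypothetical (the Worker class, its method names, and the EXAMPLE_SETTING key), and the TTL is deliberately left out so the stale read is easy to see. The real hooks would live in request handling and setting-save logic.

```python
class Worker:
    """Toy web worker with its own in-memory memo over a shared database dict."""

    def __init__(self, database):
        self.database = database  # shared by all workers
        self.memo = {}            # private to this worker process

    def get_setting(self, name):
        if name not in self.memo:
            self.memo[name] = self.database[name]
        return self.memo[name]

    def change_setting(self, name, value):
        self.database[name] = value
        self.memo.clear()  # rule 2: a setting changed, drop memoized values

    def handle_request(self, name):
        self.memo.clear()  # rule 1: start every request with an empty memo
        return self.get_setting(name)


database = {'EXAMPLE_SETTING': 'old'}
worker_a = Worker(database)
worker_b = Worker(database)

worker_a.handle_request('EXAMPLE_SETTING')        # worker A memoizes 'old'
worker_b.change_setting('EXAMPLE_SETTING', 'new')  # change lands on worker B

stale = worker_a.get_setting('EXAMPLE_SETTING')    # mid-request read on A
fresh = worker_a.handle_request('EXAMPLE_SETTING')  # next request on A
print(stale, fresh)
```

Clearing on change only helps the worker that handled the change. Clearing at the start of each request is what keeps the other workers from serving a memoized value after the user reloads the page.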