Swift is a language used in good part on mobile devices, some of which operate with a (very) limited amount of computational power — think of older iPhones or Apple Watches, for instance.
In such contexts, every CPU cycle counts and well-crafted apps will be expected to make the most of them. A known way to reduce an app footprint is through caching.
Caching is often considered a tricky business to implement. While this is true when external resources — such as web pages — are being put in cache, things are much easier when we consider only caching the result of pure functions.
A function is considered pure if it always returns the same value for a given input, while producing no side-effect along the way.
For such functions, functional programming is the designated tool to transparently bring them cache-efficiency.
Consider the following code:
https://gist.github.com/db173183776be2efc4fcd983b4be14b2
The cached
function takes a function as argument, and returns a new function, that will compute the exact same result as the original one. But once a value has been computed a first time, it will then be stored in a cache.
Consequently, any subsequent calls made with the same input will no longer require any computation to be carried out in order to bring out the correct result.
Consider that you are writing code for a 2D game, you are now able to easily work with cache-efficient versions of standard trigonometry functions:
https://gist.github.com/8bafc32cf717ab635be5be9dba21ca1e
If your games frequently makes trigonometry computations for the same values, there is now doubt that such an approach will bear significant improvement in performances 💪
The technique discussed above is so easy to use that one might think that there is a catch to it! And indeed there is one, or rather a limitation.
When you work with non-recursive functions, all is well and the cached
function works like a charm.
But if you try to pass it a recursive function, for instance:
https://gist.github.com/b005a1ab9fb75ee3a2cc0e79aea82463
You will notice no boost of performance, and indeed for values of n
greater than 20, the function will still take a lot of time to return – remember that, without caching intermediary results, this function has an exponential time complexity.
Why does this happen? Well, fib
is a recursive function, so it calls itself within its own body.
Indeed, when you put fib
through the cached
function and call the resulting function, there will be a cache look-up. But after that, the recursive calls to fib
will still point towards the original fib
, therefore no cache look-up will happen anymore.
In order to allow the same cache mechanism to work with recursive functions, additional work will be required.
First of all, we won't be able to create cached versions of existing functions. But we can provide a framework to easily write cache-efficient recursive functions.
The key problem we need to solve is that of the recursive call. We need a way to transparently wrap some code around the recursive call, in order to perform the cache look-up before actually performing it.
The solution lies in the following type signature : ((In) -> Out, In) -> Out
– In
and Out
being generic types. This signature describes a function that takes two argument :
- the recursive function
- the input of the recursive function
Through this type, we are able to both make recursive calls, using the first argument, and access the input of the function, via the second argument.
From there we only need to write an internal function that actually performs the cache look-up before, if necessary, carrying through the recursive call.
https://gist.github.com/27d9da91b6a5425de5d94328946b2024
While this implementation is indeed a little bit complex, the great thing is that using is extremely easy, as additional syntax is kept to a bear minimum:
https://gist.github.com/9022d69c913ee512b96386aa0d284e22
From there you can now use the cached function, and see that it runs much faster!
https://gist.github.com/1ad51f7f68f54929c3aac7e85a402180
The process I've described above, that consist of caching the results of intermediary results is called Memoization. Its principle is rather simple: it trades memory for CPU cycles.
While it can bring some noticeable speedup in computations, as with any kind of compromise, it's no silver bullet. Thus it's very important to consider every specific situations before applying it:
- in some cases, you might need to put a cap on the size of the cache, and implement some cache replacement policy in order to manage the cache.
- when working with floating numbers, caching might reveal inefficient because the inputs will often differ by very small amount. In such cases rounding the input before the computation might help.
- etc.
Thanks for reading this article, if you've enjoyed its content, make sure to check my GitHub repo dedicated to tips for the Swift language!