Skip to content

Instantly share code, notes, and snippets.

@marcoheisig
Last active February 18, 2021 01:03
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save marcoheisig/ada00d4fd0920ec7b2c8223ca6022dc0 to your computer and use it in GitHub Desktop.
Save marcoheisig/ada00d4fd0920ec7b2c8223ca6022dc0 to your computer and use it in GitHub Desktop.
A Mechanism for Safely Maintaining Assumptions in a Common Lisp Implementation

A Mechanism for Safely Maintaining Assumptions in a Common Lisp Implementation

Description

There are many occasions where a function can be optimized under certain assumptions about the system. But since most aspects of Common Lisp can be redefined almost arbitrarily, there are few assumptions that a compiler can generally make. (A notable exception is the behavior of definitions in the CL package, and of built-in objects.)

The idea is to enhance the performance and maintainability of a Common Lisp implementation with an extensible system for introducing assumptions and for recompiling functions whose assumptions have been violated.

Currently, we envision the following assumptions:

  • A function’s definition is constant. (A prerequisite for inlining and for using a faster calling convention).
  • A function’s lambda list is constant. (A prerequisite for using a faster calling convention)
  • A function doesn’t call change the class of its Nth supplied argument. (Useful for replacing class access with rack access in the caller)
  • A function doesn’t retain references to its Nth supplied argument. (A prerequisite for stack allocation)
  • A function doesn’t mutate its Nth supplied argument. (Allows more constant propagation in the caller)
  • A function only accepts arguments of certain types.
  • A function only returns values of certain types.
  • Only a fixed set of methods is applicable to some generic function in a certain subset of its domain.
  • A class has only a fixed set of subclasses.
  • A class is never redefined. (Allows discriminating functions to work with stamps instead of classes.)

Many more assumption are conceivable.

The challenge is that we want to be able to invalidate or update any such assumption in a multi-threaded setting, without compromising correctness or safety.

The trick is that we exploit another mechanism proposed by Strandh and Raskin: Every function has two variants with identical semantics, but with different objectives. The first variant is optimized for efficient execution, while the second, slower variant is optimized for being easy to debug. Each variant has its own entry point. Usually, the slower variant is only invoked during debugging.

The new idea is to never use any assumptions when compiling the slow variant, and to have every function fall back to the slow variant when at least one of its assumptions is invalidated. More precisely, we maintain weak data structures that, for each class or function, and for each assumption thereupon, contains a list of functions that make this assumption. When the class or function is modified, we detect assumptions that will become invalid and modify every function that depends on such an assumption such that it uses only its slow variant. Afterwards, the modification can safely take place. Once the modification is over, we can generate a new fast variant of each previously slowed function and install it. This last step can be done asynchronously.

To sum up, whenever an assumption is about to be violated, the system performs the following steps:

  • Every function that depends on the assumption is ‘slowed’, i.e., its fast variant is replaced by its slow variant.
  • The assumption is violated safely.
  • (optional) New assumptions are generated.
  • Another thread starts creating new fast variants for all functions that currently have none.

Issues

Invalidation

One problem is how we achieve this ‘overwriting’ of the fast variant with the slow one. The current idea is to override the first instruction of the former with a jump to the latter. This even works in case other threads are currently executing the fast variant, as long as the compiler ensures that this first instruction is not reachable via another control flow arc.

The other question is how we overwrite an invalidated fast variant with a new fast variant, especially if the new code requires more space. It may be necessary to split the new code into several parts.

Existing References

Another problem that needs to be addressed is how to handle existing references to a fast variant. Such references can arise because the function is currently on the call stack of any thread, or because a closure has been created.

Ideally, we’d devise a mechanism that can migrate a function’s state from that of the fast variant to the slow one, and apply this mechanism to each relevant stack frame of each thread. This mechanism would solve the problem for migrating top level functions, but it doesn’t seem feasible for closures. A system could have millions of closures over the same function, and even just holding weak references to all of them could be a problem.

We see two solutions for handling closures. The first one is not to allow closures to make any assumptions. In that case, a closure could never become invalid. The other solution would be to migrate closures lazily. The latter solution is technically challenging, so we will probably go with the former at first. For the lazy migration, it may be necessary to introduce stamps for closures.

A very rewarding optimization would be to use such a migration mechanism again to turn slow variants back into fast ones once they become available.

Remarks

  • This technique would safely allow even extreme modifications such as defining an :around method on the function BINARY+.
  • This technique could also simplify and improve the creation of discriminating functions. A discriminating function can simply assume that a class’s stamp is constant, and rely on the fact that it will be recompiled on the fly whenever this is not the case.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment