marcoheisig/assumptions.org

## assumptions.org

      
    Raw
  

              assumptions.org
            
          
    A Mechanism for Safely Maintaining Assumptions in a Common Lisp Implementation

Description

There are many occasions where a function can be optimized under certain
  assumptions about the system.  But since most aspects of Common Lisp can be
  redefined almost arbitrarily, there are few assumptions that a compiler can
  generally make.  (A notable exception is the behavior of definitions in the
  CL package, and of built-in objects.)
The idea is to enhance the performance and maintainability of a Common Lisp
  implementation with an extensible system for introducing assumptions and
  for recompiling functions whose assumptions have been violated.
Currently, we envision the following assumptions:

  A function’s definition is constant. (A prerequisite for inlining and for
    using a faster calling convention).
  A function’s lambda list is constant. (A prerequisite for using a faster
    calling convention)
  A function doesn’t call change the class of its Nth supplied argument.
    (Useful for replacing class access with rack access in the caller)
  A function doesn’t retain references to its Nth supplied argument.  (A
    prerequisite for stack allocation)
  A function doesn’t mutate its Nth supplied argument.  (Allows more
    constant propagation in the caller)
  A function only accepts arguments of certain types.
  A function only returns values of certain types.
  Only a fixed set of methods is applicable to some generic function in a certain
    subset of its domain.
  A class has only a fixed set of subclasses.
  A class is never redefined.  (Allows discriminating functions to work
    with stamps instead of classes.)

Many more assumption are conceivable.
The challenge is that we want to be able to invalidate or update any such
  assumption in a multi-threaded setting, without compromising correctness or
  safety.
The trick is that we exploit another mechanism proposed by Strandh and
  Raskin: Every function has two variants with identical semantics, but with
  different objectives.  The first variant is optimized for efficient
  execution, while the second, slower variant is optimized for being easy to
  debug.  Each variant has its own entry point.  Usually, the slower variant
  is only invoked during debugging.
The new idea is to never use any assumptions when compiling the slow
  variant, and to have every function fall back to the slow variant when at
  least one of its assumptions is invalidated.  More precisely, we maintain
  weak data structures that, for each class or function, and for each
  assumption thereupon, contains a list of functions that make this
  assumption.  When the class or function is modified, we detect assumptions
  that will become invalid and modify every function that depends on such an
  assumption such that it uses only its slow variant.  Afterwards, the
  modification can safely take place.  Once the modification is over, we can
  generate a new fast variant of each previously slowed function and install
  it.  This last step can be done asynchronously.
To sum up, whenever an assumption is about to be violated, the system
  performs the following steps:

  Every function that depends on the assumption is ‘slowed’, i.e., its fast
    variant is replaced by its slow variant.
  The assumption is violated safely.
  (optional) New assumptions are generated.
  Another thread starts creating new fast variants for all functions that
    currently have none.

Issues

Invalidation

One problem is how we achieve this ‘overwriting’ of the fast variant with
  the slow one.  The current idea is to override the first instruction of the
  former with a jump to the latter.  This even works in case other threads
  are currently executing the fast variant, as long as the compiler ensures
  that this first instruction is not reachable via another control flow arc.
The other question is how we overwrite an invalidated fast variant with a
  new fast variant, especially if the new code requires more space.  It may
  be necessary to split the new code into several parts.
Existing References

Another problem that needs to be addressed is how to handle existing
  references to a fast variant.  Such references can arise because the
  function is currently on the call stack of any thread, or because a closure
  has been created.
Ideally, we’d devise a mechanism that can migrate a function’s state from
  that of the fast variant to the slow one, and apply this mechanism to each
  relevant stack frame of each thread.  This mechanism would solve the
  problem for migrating top level functions, but it doesn’t seem feasible for
  closures.  A system could have millions of closures over the same function,
  and even just holding weak references to all of them could be a problem.
We see two solutions for handling closures.  The first one is not to allow
  closures to make any assumptions.  In that case, a closure could never
  become invalid.  The other solution would be to migrate closures lazily.
  The latter solution is technically challenging, so we will probably go with
  the former at first.  For the lazy migration, it may be necessary to
  introduce stamps for closures.
A very rewarding optimization would be to use such a migration mechanism
  again to turn slow variants back into fast ones once they become available.
Remarks


  This technique would safely allow even extreme modifications such as
    defining an :around method on the function BINARY+.
  This technique could also simplify and improve the creation of
    discriminating functions.  A discriminating function can simply assume
    that a class’s stamp is constant, and rely on the fact that it will be
    recompiled on the fly whenever this is not the case.