Skip to content

Instantly share code, notes, and snippets.

@paniq
Last active May 7, 2019 08:03
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save paniq/8c39baea49692dcda34b7f031eeaddfe to your computer and use it in GitHub Desktop.
Save paniq/8c39baea49692dcda34b7f031eeaddfe to your computer and use it in GitHub Desktop.

Programmable Sidebands for Scopes

by Leonard Ritter, Duangle GbR leonard.ritter@duangle.com, 2019-05-06

During my work on Scopes' borrow checker I noticed a pattern that generalizes to any mechanism that simulates runtime dataflow at compile time. It is potentially possible to implement this mechanism in a generic way so that users can add their own sidebands, and if that proves too byzantine, we have at least a nice internal universal interface that makes sense of such augmentations.

Strictly speaking, type checking itself is already a simulation of runtime dataflow; through types we are primarily following data formats, but also references, keyed arguments, range constraints. Types form a veritable jungle of runtime information that we wish to capture during compilation.

At the moment that we add qualifiers to types we lose simplicity. This is the point where I wonder whether we shouldn't learn a bit from data oriented programming and strive to create a new sideband for each expression attribute we wish to qualify. But in order to do that, we need to know how qualifiers generalize in Scopes.

One restless night later, I believe I may have an inkling of which parts can be generalized.

Data Model

After typechecking a function, each computed Value within the function is mapped to a type. Likewise, we keep separate mappings, one for each sideband we are interested in, represented by a qualifier type. That's right, we do not wrap our main types in qualifiers, we keep them in separate maps. That way, we are able to transform qualifiers without continuously having to rewrap the underlying type.

I call these attributes qualifiers but they can be much more than simple tags. Each qualifier is a record of one or multiple attributes which are able to mutate as the function is typed.

We could keep one Value -> Qualifier map per function, but we also need to capture diverging state. Take borrow checking for instance: the move state of a unique value changes as we enter and exit branches of the function. Therefore we map each Block within a function to a qualifier map, allowing each basic block to see a different reality of the execution flow.

When the function is typed, we need to export our qualifiers to callers, as well as be able to let the callers give us initial qualifiers through function parameters. Therefore the function itself needs, in addition to its main signature, a qualifier signature for each sideband. We cache and reuse functions by their type signatures, and now also by their qualifier signatures, namely the parameter types and signatures passed to the function. We want to avoid retyping the same function multiple times just because its parameter qualifiers fluctuate slightly. Therefore, in addition to our state variables, we need to be able to produce conditioned, condensed, memoizable qualifier signatures that can be exchanged globally through the program.

Here is our final shopping list, per sideband:

1x Qualifying - our local value state within the function
1x Map: Block -> Value -> Qualifying - our per basic block bookkeeping of states
1x Qualifier - our importable and exportable value state
1x Function -> QualifiedSignature map - how we tell callers what goes in and what comes out

Processing Flow

In order to simulate our runtime values accurately, we need to update our qualifiers as the function is typed. This means equipping each produced Value with a Qualifying state within the local block, as it is produced. We discern between these value types:

Parameter

Each function parameter is associated with a Qualifying state which is translated from the Qualifier of the template type. We need a function here that receives all qualifiers and all parameters and maps each parameter to a local qualifying state.

Call

This is the mirror image of typing a function. We must export our qualifying states as qualifiers, type a template with them, verify the signature and translate the result type's qualifier to a local qualifying state.

Instruction

Instructions are similar to calls, except they are often polymorphic and only informally specified, and we do not need an export/import step for qualifiers. Based on the instruction name, we look at and validate the qualifying state of arguments, and then define the qualifying state for the instruction result.

Switch, CondBr

We take Switch here as a more general case of CondBr, conditional breaks, and both work roughly the same, except CondBr only uses booleans to decide and produces only two cases.

Each switch case reflects a different reality that our program is forking to. Eventually these different realities need to merge into a common one, and we will cover that in the next section. All we need to do here for each cases' block is to copy the qualifying map of the parent block, as in the beginning, each parallel reality starts out with the same starting conditions. Perhaps we need to make an adjustment to some states here depending on the case we have entered, but that is specific to the kind of sideband we are adding.

Merge, Repeat, Return, Raise

The counterpiece to each Switch, we are trying to join different branches, different realities to a single one by use of one of the four block terminators. At this point we merely record the return paths, but we can only begin to merge once all return paths are known, which is at merge label, loop and function boundaries.

Label, Function

To discern the return type of a label, we need to enumerate all merge instructions returning here and merge the qualifying states an argument has in each individual block we are returning from. In order to achieve this, it might be necessary to generate additional code into the returning block to ensure that realities match sufficiently. On the example of borrow checking: if a value was destroyed in block A, but not in block B, we must generate a destructor for this value in block B so that the two realities match as they merge up. If we can't ultimately find a way to merge two qualifying states, we must give up and produce an error to the user.

When the label has been processed, the qualifying states of values which are still accessible have been merged into the block containing the label, and we are able to continue typing from there.

For the function boundary, the process is the same, except we now merge return paths and raise paths.

Loop

Loops are special in that we must ensure that the final state of parent values in the loop block match the initial state. For example, it is illegal to move a value from a parent scope inside the loop, as we can't perform this operation more than once.

Function Qualifiers

If we wish to annotate a particular quality not covered by a function return type (the function exception type is an example for such an attribute, but we can also imagine e.g. a set of globals that the function will mutate), we need a qualifier for the function itself. These qualifiers are typically discovered during calls, and propagated to the qualifier of the function we are typing.

Caveats

I see a major hindrance in the implementation of such sidebands at user level in the fact that many functions have already been typed by the time the sideband is registered. One simple, somewhat costly fix is to invalidate the template cache and force a re-typing of all existing functions this way.

Another problem is dealing with functions imported from other languages. A sideband must be able to deal with the fact that a callable offers no sideband information.

We can possibly solve both problems by specifying, as with borrow checking, that only values of types supporting this sideband are processed by it. It is impossible to create types supporting a sideband that doesn't exist, and we limit the number of functions affected by a new sideband.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment