Practical Type Coercion

A quick summary of how the author sees type coercion / casting / conversion in programming languages, followed by a somewhat novel proposal.

Type casting is a special form of functional programming, where we rewrite a function B(source:A, options...) as a cast operation A -> B<options...>, where B is now a type template, and options... is an optional list of auxiliary compile time constants that become attributes of the type template.
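To make the rewrite concrete, here is a minimal present-day C++ analogue; the fixed_point template and its FractionBits option are made up for illustration:

```cpp
#include <cstdint>

// Hypothetical example: a fixed-point type whose scaling factor is an
// "option" baked into the type template rather than passed at runtime.
template <int FractionBits>
struct fixed_point {
    std::int64_t raw;

    // Constructor form fixed_point<FractionBits>(x) is the cast operation
    // double -> fixed_point<FractionBits>.
    explicit fixed_point(double x)
        : raw(static_cast<std::int64_t>(x * (1 << FractionBits))) {}
};

int main() {
    // The auxiliary constant (16) is an attribute of the type template,
    // not a runtime argument of the conversion.
    fixed_point<16> y(3.5);
    return y.raw == 229376 ? 0 : 1;  // 3.5 * 2^16
}
```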

Type casting is effectively data conversion. You have source data in format A that you would like to transform to target data in format B. And as in actual data conversion, we have two levels of quality:

  1. Lossless conversion, which transforms the data without destroying information: the conversion is invertible, i.e. the mapping is bijective and time-symmetric. Some examples of lossless conversions are "big endian to little endian", "upcast type to tagged union", "sign-extend integer from 32-bit to 64-bit", "rawstring to voidstar", "invert sign of integer" or "WAV to FLAC". This kind of conversion can be conveniently implicit in code, as an unexpected conversion can correct itself without the user even noticing.
  2. Lossy conversion, which does destroy information. This isn't a bijective mapping but a time-directed reduction. Some examples of lossy conversions are "32-bit integer to boolean", "float to integer", "integer to hash" or "WAV to MP3". Lossy conversions must always be explicit in code (or implicit only in places where they are obvious, such as the predicate of a conditional, which always needs a boolean reduction), so that the code self-documents where information degrades. Both kinds are contrasted in the sketch below.
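To make the distinction concrete, here is a small C++ sketch contrasting the two categories on the integer-widening example from the list; the literal values are arbitrary:

```cpp
#include <cassert>
#include <cstdint>

int main() {
    // Lossless: sign-extending a 32-bit integer to 64 bits is invertible.
    std::int32_t a = -12345;
    std::int64_t wide = a;                         // i32 -> i64
    assert(static_cast<std::int32_t>(wide) == a);  // i64 -> i32 recovers a

    // Lossy: truncating a general 64-bit value destroys the upper bits
    // (modular reduction since C++20), so the round trip fails.
    std::int64_t b = 0x100000001;
    std::int32_t narrow = static_cast<std::int32_t>(b);
    assert(static_cast<std::int64_t>(narrow) != b);
    return 0;
}
```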

Furthermore, a sufficiently complex program can end up with multiple conversion paths between two types, some more direct than others. And because we talk about paths, that means that typecast resolution is pathfinding. And because it is pathfinding, there is a cost metric that can be applied to type conversion: some conversion paths are cheaper to execute in terms of computing cycles than others, and some conversion paths are lossier than others.

Coercion is Pathfinding

This is the novel bit that I wanted to get to: assigning a cost to a conversion makes it easier for the compiler to pick a path, as we can apply a search algorithm such as A* to solve a request. Without such a cost, the user is forced to disambiguate an implicit conversion (type coercion) by hand, even though in their view, one conversion is clearly superior to the other. But, for example in C++ (and to my knowledge every other language out there), there is no way to communicate that opinion to the compiler. Neither can I, as the user, tag a conversion as lossy, nor does the compiler count the cycles/operations needed to perform a conversion. I cannot even presently interrogate the compiler about how it would perform a particular unambiguous conversion.
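For reference, this is what the status quo looks like in C++ today; the Celsius/Fahrenheit types are a made-up illustration:

```cpp
// Both constructors from int are viable and equally ranked, and there is
// no way to tell the compiler that one of them is cheaper or preferred.
struct Celsius    { double v; Celsius(int x)    : v(x) {} };
struct Fahrenheit { double v; Fahrenheit(int x) : v(x) {} };

void display(Celsius) {}
void display(Fahrenheit) {}

int main() {
    // display(20);        // error: ambiguous call, even if the author
    //                     // considers one target obviously better
    display(Celsius(20));  // the only recourse is to spell it out
    return 0;
}
```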

These considerations render users conservative about adding implicit conversion paths, and they tend to keep all operations explicit to retain oversight. This is unfortunate, for multiple reasons: a) implicit bijective conversions are at worst inefficient but harmless, since they're unnoticeable, and b) explicit conversions may preempt the best option, preventing the compiler from finding a more efficient path, so we have to rely on optimization passes to elide redundant operations.

Coercion is Expressive

As type conversion is just a form of functional programming, a compiler with "smart coercion" may support a wholly new programming style. Users may find themselves attaching more attributes to types and adding a rich set of conversion routines in order to leverage the power of compile time pathfinding in systems programming languages. What sets this method apart from traditional functional programming is perhaps best demonstrated with an example.

Say we perform a transformation through two functions, g(f(x:X, f-options...)->Y, g-options...)->Z. Rewriting this into cast-style, we get X -> Y<f-options...> -> Z<g-options...>. Since the auxiliary parameters to the conversion become part of the type, we are now in the unique situation that we can match a particular combination of (f-options..., g-options...) in the conversion of Y to Z. Languages like Haskell with powerful pattern matching already support such matches in function application, but systems programming languages typically do not. This would provide a way for programmers to get access to such powerful facilities without losing the ability to reason about the performance of their programs. On the contrary, since the pathfinder always minimizes the cost of a conversion (as it is estimated at compile time), they can be sure that solutions bias towards efficiency and losslessness.
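A rough C++ approximation of this kind of option matching, using function template overloads; the pcm/filtered templates, their Rate option and the convert functions are all hypothetical stand-ins:

```cpp
#include <cstdio>

// Hypothetical intermediate and target type templates; the option value
// (here, a sample rate) is part of the type.
template <int Rate> struct pcm      { float value; };
template <int Rate> struct filtered { float value; };

// Generic edge: pcm<From> -> filtered<To> needs an actual resampling step
// (replaced by placeholder arithmetic here).
template <int From, int To>
filtered<To> convert(pcm<From> x) {
    return { x.value * (static_cast<float>(To) / From) };
}

// Matching on the *combination* of options: when both rates agree, the
// conversion degenerates to a copy. A cost-aware resolver could pick this
// edge automatically; here we select it via explicit template arguments.
template <int Same>
filtered<Same> convert(pcm<Same> x) {
    return { x.value };
}

int main() {
    pcm<48000> a{1.0f};
    filtered<48000> b = convert<48000>(a);         // matches the cheap edge
    filtered<44100> c = convert<48000, 44100>(a);  // matches the generic edge
    std::printf("%f %f\n", b.value, c.value);
    return 0;
}
```

The interesting part is that the special case is resolved purely from the types, which is exactly the kind of edge a cost-aware resolver could prefer without being told.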

Syntax Considerations

Here's one way the feature could be exposed in a C-ish language: we have two separate dispatching operations, bijective_cast<T>(x) for lossless operations and reduce_cast<T>(x) for lossy ones. reduce_cast would always first attempt a bijective_cast before it does a lossy one. A cast operation A -> B<...> could also be specified in constructor form B(x:A, ...).
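A minimal sketch of how the two dispatchers could look if emulated in present-day C++, assuming a user-specialized conversion<> trait (all names here are hypothetical, and the path-finding part is omitted):

```cpp
#include <cstdint>

// Users would specialize conversion<> with a convert() function and a
// `lossless` flag; no path search is performed in this toy version.
template <class To, class From>
struct conversion;

template <class To, class From>
To bijective_cast(From x) {
    static_assert(conversion<To, From>::lossless,
                  "bijective_cast requires a lossless conversion edge");
    return conversion<To, From>::convert(x);
}

template <class To, class From>
To reduce_cast(From x) {
    // The proposal would try a bijective path first and fall back to a
    // lossy one; this sketch simply accepts any registered edge.
    return conversion<To, From>::convert(x);
}

// Example edges.
template <> struct conversion<std::int64_t, std::int32_t> {
    static constexpr bool lossless = true;   // sign extension
    static std::int64_t convert(std::int32_t x) { return x; }
};
template <> struct conversion<std::int32_t, std::int64_t> {
    static constexpr bool lossless = false;  // truncation
    static std::int32_t convert(std::int64_t x) {
        return static_cast<std::int32_t>(x);
    }
};

int main() {
    std::int64_t wide = bijective_cast<std::int64_t>(std::int32_t{-7});
    std::int32_t narrow = reduce_cast<std::int32_t>(wide);
    return narrow == -7 ? 0 : 1;
}
```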

On the other end, we specify conversion patterns between two types; these build the edges of the conversion graph through which the cast operators can then find a path. For each pattern A<...> -> B<...>, we specify a function that can perform the operation, as well as a compile time expression that estimates the cost as a positive real value, with inf communicating impossibility and nan becoming a poison value that produces a compile time error.
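A sketch of such per-edge cost declarations, again emulated with a hypothetical C++ trait; the example types and cost values are invented:

```cpp
#include <limits>

// Default: no edge between two types, expressed as infinite cost. A NaN
// entry would be treated as a poison value by the resolver (not shown).
template <class To, class From>
struct edge_cost {
    static constexpr double value = std::numeric_limits<double>::infinity();
};

struct rawstring {}; struct voidstar {}; struct hash32 {};

template <> struct edge_cost<voidstar, rawstring> {
    static constexpr double value = 0.0;   // pure reinterpretation, free
};
template <> struct edge_cost<hash32, rawstring> {
    static constexpr double value = 64.0;  // rough cycle estimate for hashing
};

static_assert(edge_cost<voidstar, rawstring>::value == 0.0);
static_assert(edge_cost<hash32, voidstar>::value ==
              std::numeric_limits<double>::infinity());  // no such edge

int main() { return 0; }
```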

To ensure that bijective conversions are indeed bijective, we require that for a conversion operation A -> B, the user also specifies the inverse operation B -> A. This ensures that users choose the correct cast category when defining new conversions, and allows us to automate testing of bijective operations by asserting that, under fuzzed inputs, the assumption x == y holds for the round trip x:A -> B -> y:A as well as for x:B -> A -> y:B.
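A minimal sketch of such an automated round-trip test, using the endianness-swap example from above as the conversion pair under test (function names are illustrative):

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// The "big endian <-> little endian" pair from above; a byte swap is its
// own inverse, so both round-trip directions can be checked.
std::uint32_t bswap32(std::uint32_t x) {
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}
std::uint32_t be_to_le(std::uint32_t x) { return bswap32(x); }
std::uint32_t le_to_be(std::uint32_t x) { return bswap32(x); }

int main() {
    std::mt19937 rng(12345);
    std::uniform_int_distribution<std::uint32_t> dist;
    for (int i = 0; i < 100000; ++i) {
        std::uint32_t x = dist(rng);
        assert(le_to_be(be_to_le(x)) == x);  // x:A -> B -> y:A
        assert(be_to_le(le_to_be(x)) == x);  // x:B -> A -> y:B
    }
    return 0;
}
```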

Furthermore, some operations are only bijective at compile time (such as truncating a 64-bit constant to 32-bit when the upper 32 bits are zero) - these would become so-called "static bijective conversions", and only become available when the arguments are constants.
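A rough C++20 approximation of a static bijective conversion, using consteval so the truncation is only accepted for constants that survive the round trip; the truncate32 name is made up:

```cpp
#include <cstdint>

// Truncation of a 64-bit value to 32 bits is only accepted when the
// argument is a compile time constant whose upper 32 bits are zero.
consteval std::uint32_t truncate32(std::uint64_t x) {
    if (x >> 32) {
        // Evaluated only at compile time; taking this branch makes the call
        // ill-formed because a consteval call must be a constant expression.
        throw "truncation would lose the upper 32 bits";
    }
    return static_cast<std::uint32_t>(x);
}

int main() {
    constexpr std::uint32_t ok = truncate32(0x12345678ull);
    // constexpr std::uint32_t bad = truncate32(0x100000001ull);  // rejected
    return ok == 0x12345678u ? 0 : 1;
}
```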

The Cost of Composition

A complication of this setup is that a conversion may perform explicit conversions in its body yet report a lower cost than performing those steps on the main path would take, effectively offering a shortcut that turns out to be "the scenic route". With such functions, the user must take care that these hidden costs are always included in the explicit cost estimate reported for the operation.

It would therefore be ideal if the compiler were always able to infer and propagate costs in composite conversions, so that false shortcuts are prevented and users are freed from the burden of explicit accounting where the solution is obvious.

Coercion As Far As The Eye Can See

Through the relationship B(source:A, options...) ⇔ source:A -> B<options...>, and by allowing types to have variadic arguments, we are able to support an arbitrary number of arguments and results. A function f(a:A, b:B, ...) -> U, V can be rewritten as the conversion {a, b}:pair<A, B> -> f<U, V, ...>.
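As a trivial C++ sketch of this rewriting, assuming a hypothetical divmod_result aggregate for the two results:

```cpp
#include <cstdint>
#include <utility>

// A two-argument, two-result function (integer divmod) recast as a
// conversion from a pair of inputs to a result aggregate.
struct divmod_result { std::int64_t quot, rem; };

divmod_result convert(std::pair<std::int64_t, std::int64_t> in) {
    return { in.first / in.second, in.first % in.second };
}

int main() {
    // {a, b}:pair<A, B> -> divmod_result
    divmod_result r = convert({17, 5});
    return (r.quot == 3 && r.rem == 2) ? 0 : 1;
}
```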

With this, we have the full lambda calculus at our disposal, with the ability to execute loops and recursively aggregate parameters into compile time lists, which introduces the halting problem to type conversion: as parameters are added to a type, the dimensionality of the explored space increases, and the pathfinder may find itself exploring paths that just go on forever, because they are either the seemingly cheapest option or the only one. The traditional solution here is to limit the number of steps: if the pathfinder takes too long, the resolution is aborted. Tuning this limit to a sensible maximum may prove difficult. We may assign a higher resolution cost to steps that do not perform any work (read: do not generate code) and only modulate types. When code is generated as part of a conversion, then the instruction limits of a module are eventually reached. When no code is generated and only types are modulated, then the resolution cost increases and eventually reaches its limit, inviting the user to either fix the loop, or avoid attempting to solve a complex problem at compile time.

Implementation Notes

This section is expanded as I attempt an implementation, experiment, and learn.

While presently trying to implement smart coercion with A*, I recognize several snags:

  1. There's no heuristic distance estimator h(x) that sharpens the search towards the shortest Euclidean distance, since the type space is non-Euclidean. I think. It would be cool if it were. In any case, we cannot heuristically guess from any two types what the effort would be to transform one into the other; this is pretty much what we need the algorithm for. Thus we'd set h(x)=0, which turns our algorithm into Dijkstra's algorithm. However, to improve convergence, it is necessary to provide some heuristic here. I'm getting significant improvements simply by using the absolute difference in bytes between the source and the target type (see the sketch after this list).

  2. There can be a shit ton of neighbors to a type, sometimes even an infinite number when a parameter is continuous. Take for example integers: each integer can be conceptually converted to at least 2 * 64 different types, but practically way way more than that. Here it does help to know what the goal type is, since we can then say "oh we can actually portal right over there" and provide the best fitting type. But it is still possible that for some types, regardless of knowing the goal, the number of neighbors seems at first uncountable until we know a path.

    To make them countable, we need to provide a limited expressive subset that can be combined over multiple steps (as a contrived example, instead of 2 * 64 integers, we only return integers whose bitwidth is +1, *2, -1, /2 of our source bitwidth - or we offer variations where in each, a different bit is flipped), and then, somehow, recognize combinable sequences in the path (e.g. i16 -> i32 -> i64 can be shortened to i16 -> i64) and reduce them. We can at any point in the A* algorithm reconstruct our path through the cameFrom map, so it is possible to backtrack to the origin and query for each step whether there's a shortcut. If we find one, we can update cameFrom, as well as the cost metric for our neighbor. Of course, the longer our path gets, the more this approach increases the cost of discovering a new path, so there's room for improving the algorithm further here.

  3. Searching from both ends supports both cast-from and cast-to operations. A solution is found when our best frontier node appears in the expanded nodes of the opposite frontier.
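Below is a compact, single-direction sketch of the search described in these notes: A* over a hand-built conversion graph, using the absolute byte-size difference from point 1 as the heuristic. The graph, its costs and the type names are invented, and neighbor enumeration (point 2) and bidirectional search (point 3) are left out:

```cpp
#include <cmath>
#include <cstdio>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <vector>

struct Edge { std::string to; double cost; };

// Invented byte sizes and conversion edges; costs are arbitrary estimates.
std::map<std::string, int> size_of = {
    {"i16", 2}, {"i32", 4}, {"i64", 8}, {"f64", 8}};
std::map<std::string, std::vector<Edge>> edges = {
    {"i16", {{"i32", 1.0}, {"i64", 1.5}}},
    {"i32", {{"i64", 1.0}, {"f64", 4.0}}},
    {"i64", {{"f64", 4.0}}}};

// Heuristic from point 1: absolute difference in bytes between two types.
double heuristic(const std::string& a, const std::string& b) {
    return std::fabs(static_cast<double>(size_of[a] - size_of[b]));
}

// Returns the cheapest conversion path from `src` to `dst`, or empty if none.
std::vector<std::string> find_path(const std::string& src,
                                   const std::string& dst) {
    using Entry = std::pair<double, std::string>;  // (f-score, type)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<>> open;
    std::map<std::string, double> g = {{src, 0.0}};
    std::map<std::string, std::string> came_from;
    open.push({heuristic(src, dst), src});

    while (!open.empty()) {
        std::string cur = open.top().second;
        open.pop();
        if (cur == dst) {  // reconstruct the path through came_from
            std::vector<std::string> path = {cur};
            while (path.back() != src) path.push_back(came_from[path.back()]);
            return {path.rbegin(), path.rend()};
        }
        for (const Edge& e : edges[cur]) {
            double tentative = g[cur] + e.cost;
            if (!g.count(e.to) || tentative < g[e.to]) {
                g[e.to] = tentative;
                came_from[e.to] = cur;
                open.push({tentative + heuristic(e.to, dst), e.to});
            }
        }
    }
    return {};
}

int main() {
    for (const std::string& step : find_path("i16", "f64"))
        std::printf("%s ", step.c_str());
    std::printf("\n");  // prints "i16 i32 f64" with the costs above
    return 0;
}
```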
