@edmundsmith
Created November 22, 2021 14:42

Things I wish $Lang had

Sometimes, when programming, I notice that I'm missing some useful feature, or, when reading, I think 'this would be really useful to have as part of $Lang'.

Explicit order-invariance and synchronisation points

C compilers are part of why C is hard. They re-order statements according to rules that might as well be a magic black box, and understanding what's inside that box is much harder than it should be. Many statements (or expressions) act completely independently of each other, so their order of execution should not matter; others genuinely depend on the execution of a previous statement or expression. It would be nice to have the execution dependency graph laid out explicitly and syntactically, rather than inferred from analysis or approximated by assuming that every statement induces a synchronisation point.

Some languages get part of this right. Haskell's laziness means order of execution doesn't need to match the order it's written down, and its purity means that order of execution also shouldn't affect the computation anyway. Many languages support explicit async execution, using await to denote a point whose successful evaluation depends explicitly on the completion of the await-ed-upon computation, but this only applies to async code, when really we want all code to have this feature.

I wish languages would go further in this regard, especially the impure/stateful ones. Doing this in Python looks rather more like calling a clunky DSL than actually using Python itself, as everything has to be made asynchronous and wrapped up in foo, bar, baz = await asyncio.gather(a(), b(), c()), when really I want to write something more like

foo = a()
bar = b()
baz = c()
# We have no idea what order they are executed in, whether they are 'async' or 'sync' functions,
# or even whether they've been started yet

# Block until all three are completed
# Before: {foo, bar, baz} ~ FutureLike<PythonVal>
# After: {foo, bar, baz} ~ PythonVal
sync foo, bar, baz

qux = d(foo, bar, baz)
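For comparison, here is a hedged sketch of the nearest thing today's Rust offers: the futures crate's join! macro, with a, b, c and d as hypothetical stand-in functions. join! is the explicit synchronisation point, but it still forces the async colouring complained about above.

```rust
// Sketch only: assumes the `futures` crate (0.3) as a dependency; a/b/c/d are hypothetical.
use futures::executor::block_on;
use futures::join;

async fn a() -> i32 { 1 }
async fn b() -> i32 { 2 }
async fn c() -> i32 { 3 }

fn d(foo: i32, bar: i32, baz: i32) -> i32 { foo + bar + baz }

fn main() {
    block_on(async {
        // The three futures are driven concurrently; `join!` is the single
        // explicit synchronisation point where all of them must have completed.
        let (foo, bar, baz) = join!(a(), b(), c());
        let qux = d(foo, bar, baz);
        println!("{qux}");
    });
}
```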

Ideally, we could use separation logic to ensure that the order-independent statements truly were independent, but if this were something like Python they'd probably just leave that up in the air, like most other things important to software engineering (see: type hinting - "we'll add it to the language, but it's unspecified behaviour how it affects execution").

On top of making explicit the re-orderings that already happen when nobody's looking, freeing implementations from order-observability should simplify things, or at least make them less chaotic. In such a world, it becomes clear how bonkers it is that in C, mutating a variable can make the evaluation order 'observable' and so force de-optimisation, and we don't even have a clear definition of 'observable' to work with. The whole issue is implicit in our code and languages, emerging only from how much out-of-order execution matters to performance.

I'd love to see this improved. A bit like the borrow checker in Rust, it may cause some teething pains, but in most situations it would be mighty helpful to have a compiler moaning about your implicit or sloppily-induced synchronisation points.

Maybe we can give it a catchy slogan for people to feel smug about being jargonauts: "explicit syntactic lattices!" or "semi-orderful evaluation!" (or rather don't, programming is much better without the gatekeeping).

Polarised Types

Polarised Logic is interesting - it allows the description of, and reasoning with, both positive and negative types. We can get something close to this already with functions, if we take the inputs to be negative and the outputs to be positive. Thus, we might consider A -> B to be a type (-A, +B).

In fact, Prolog's mode declarations (+, -, ?) use something very similar to describe the behaviour of predicates (~ functions) in the presence of bidirectional evaluation/unification.

Polarised types appear to map nicely to the ideas of session types, which allow the description of protocols in the type system, and to game semantics, which offer an interpretation of computation and logic as the interaction of (generally competing) agents.

Session types are a great idea that should be considered by people designing protocols, as they introduce a rigour in implementation that can be otherwise lacking.
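As a flavour of that rigour, here is a minimal, hand-rolled typestate sketch in Rust (all names invented for illustration): each protocol state only exposes the next legal operation, so sending messages out of order simply doesn't compile.

```rust
use std::marker::PhantomData;

// Protocol states as zero-sized markers: send a login, receive a reply, done.
struct AwaitingLogin;
struct AwaitingReply;
struct Closed;

// A channel whose type parameter records where we are in the protocol.
struct Chan<State> { _state: PhantomData<State> }

impl Chan<AwaitingLogin> {
    fn new() -> Self { Chan { _state: PhantomData } }
    // Sending the login consumes the channel and returns one in the next state.
    fn send_login(self, _user: &str) -> Chan<AwaitingReply> {
        Chan { _state: PhantomData }
    }
}

impl Chan<AwaitingReply> {
    fn recv_reply(self) -> (String, Chan<Closed>) {
        ("ok".to_string(), Chan { _state: PhantomData })
    }
}

fn main() {
    let chan = Chan::<AwaitingLogin>::new();
    let chan = chan.send_login("alice");
    let (reply, _closed) = chan.recv_reply();
    println!("{reply}");
    // chan.send_login("again"); // would not compile: `chan` was consumed,
    //                           // and Chan<Closed> has no send_login.
}
```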

Game semantics are potentially useful for computer security and practical computing, as they provide a framework for analysing competitive or adversarial interactions in a system (think fair resource scheduling, or designing protocols for network security).

When enriched with Linear types, Polarised type systems also appear to map interestingly to semantic models of quantum computing, perhaps bridging a gap between qubit/wire behaviour and classical type systems. Things like entanglement and destructive measurement can be described with polarised linear types.

It's probably possible to realise a (horrifically ugly) version of Polarised (Linear) Types today in languages with linearity, such as Idris or Rust, by (ab)using the concept of input:output :: negative:positive, but I don't know whether this transform captures polarised linear logic completely.
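For a taste of that (ab)use, here is a hedged Rust sketch: a positive A is an owned value we produce, a negative A is a one-shot consumer of an A, and turning a function around swaps its input to the negative side. Rust's ownership is affine rather than truly linear, and this encoding is illustrative, not a faithful polarised linear calculus.

```rust
// Positive A: a value we have.  Negative A: a one-shot obligation to consume an A.
type Neg<A> = Box<dyn FnOnce(A)>;

// "Give me a consumer of B and I'll hand you back a consumer of A":
// the function's input flips to negative, its output stays positive.
fn polarise<A: 'static, B: 'static>(f: impl FnOnce(A) -> B + 'static, k: Neg<B>) -> Neg<A> {
    Box::new(move |a| k(f(a)))
}

fn main() {
    // A consumer of i32 (a "negative i32").
    let print_it: Neg<i32> = Box::new(|b| println!("got {b}"));
    // Flip `double`'s input onto the negative side.
    let need_an_input: Neg<i32> = polarise(|a: i32| a * 2, print_it);
    // Discharge the obligation by supplying the positive value.
    need_an_input(21); // prints "got 42"
}
```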

Simpler threading, fibre-ing and SIMD

I want concurrency to be trivial to add, not to require a separate threading/fibres library and defining whole functions for each worker thread. I want parallelism to be trivial to add, not to require a separate threading/BLAS library or a reliance on compiler intrinsics. Currently these range from simple to oh-god-I-should-become-a-hermit. Simple is not enough; we should be able to make this trivial.

OpenMP gets part of this right - parallel for loops are very easy to add for a quick speedup most of the time. Rayon also gets part of this right - parallel iterators are usually simple to add.

The closest thing I've seen to trivial concurrency is Scala's Future { <expr> }. Even Go's much-lauded go statement requires go func() { <expr> }() and needs channels to 'return' any values. Maybe Rust's futures::lazy is close (and just needs a macro such that conc! { <expr> } expands to futures::lazy(|| { <expr> })).
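To make the comparison concrete, here is roughly what the Rayon route mentioned above looks like; a minimal sketch, assuming the rayon crate as a dependency.

```rust
// Sketch: requires the `rayon` crate. Swapping `iter()` for `par_iter()` is
// usually the whole diff for an embarrassingly-parallel loop.
use rayon::prelude::*;

fn main() {
    let inputs: Vec<u64> = (0..1_000_000).collect();
    let sum_of_squares: u64 = inputs.par_iter().map(|x| x * x).sum();
    println!("{sum_of_squares}");
}
```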

Bidirectional/Invertible Macros

People who work with macros are familiar with the idea of viewing a macro expansion: you take your macro invocation, and inspect the code it generates. You can use this to see inside your abstraction, i.e. to de-abstract (adstract?). What does this look like in the opposite direction?

And if you define an invertible macro, could it perform abstractions for you?

A more directly useful thing invertible macros should give is the ability to transpose semantic analyses of the expanded 'source' back into the pre-expansion source. Hopefully this will make things like DSLs or helpers simpler to understand and make error messages much clearer. If a macro can be inverted, then there's nothing special about which side of the 'expansion' you are. This sounds much saner to me.
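A toy way to picture "nothing special about which side of the expansion you are": a desugar/resugar pair over a small expression tree, where resugar is the (partial) inverse used to report analyses in terms of the original sugar. Everything here is invented purely for illustration.

```rust
// Illustrative only: one piece of sugar (`a && b`) and its partial inverse.
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Var(String),
    If(Box<Expr>, Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>), // sugar
    Bool(bool),
}

// Forward direction: the usual expansion, `a && b` => `if a then b else false`.
fn desugar(e: Expr) -> Expr {
    match e {
        Expr::And(a, b) => Expr::If(
            Box::new(desugar(*a)),
            Box::new(desugar(*b)),
            Box::new(Expr::Bool(false)),
        ),
        Expr::If(c, t, f) => Expr::If(
            Box::new(desugar(*c)),
            Box::new(desugar(*t)),
            Box::new(desugar(*f)),
        ),
        other => other,
    }
}

// Backward direction: recognise the expansion's shape and rebuild the sugar,
// so an analysis phrased on the core term can be reported on the surface term.
// (Partial and not injective in general.)
fn resugar(e: Expr) -> Expr {
    match e {
        Expr::If(c, t, f) if *f == Expr::Bool(false) => {
            Expr::And(Box::new(resugar(*c)), Box::new(resugar(*t)))
        }
        Expr::If(c, t, f) => Expr::If(
            Box::new(resugar(*c)),
            Box::new(resugar(*t)),
            Box::new(resugar(*f)),
        ),
        other => other,
    }
}

fn main() {
    let surface = Expr::And(Box::new(Expr::Var("a".into())), Box::new(Expr::Var("b".into())));
    let core = desugar(surface.clone());
    assert_eq!(resugar(core), surface);
}
```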

Attributes and Traits for Functions

Datatypes can be given interfaces, traits or attributes in most languages. I'd like to see this extended to functions, so that we can say something like fn f() where f : Monotonic + Threadsafe, or maybe fn f(); impl Monotonic for f { prfTerm = ... }

Languages have dabbled with a subset of this functionality before, with things like constexpr, pure, unsafe, @nogc and nothrow. However, these prior cases are (almost?) all language built-ins, and as such we cannot define our own without deviating from the standard language. Imagine if we could have a function that guaranteed that it did not allocate - we could then reason about our allocations in the type system, such as having noalloc void fn() function pointers, or noalloc-safe function composition.
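One way to approximate this today, without language support, is a wrapper type carrying an unchecked marker trait; the property is merely asserted by the programmer rather than verified, and all the names below are made up for illustration.

```rust
// Illustrative sketch only: the `NoAlloc` property is claimed, not checked.
trait NoAlloc {}

// A function value that claims the property.
struct NoAllocFn<A, B> { f: fn(A) -> B }
impl<A, B> NoAlloc for NoAllocFn<A, B> {}

impl<A, B> NoAllocFn<A, B> {
    // Unsafe-in-spirit: the caller promises `f` never allocates.
    fn assert_no_alloc(f: fn(A) -> B) -> Self { Self { f } }
    fn call(&self, a: A) -> B { (self.f)(a) }
}

// Composition keeps the claim: if neither half allocates, the pipeline doesn't.
struct Composed<A, B, C> { f: NoAllocFn<A, B>, g: NoAllocFn<B, C> }
impl<A, B, C> NoAlloc for Composed<A, B, C> {}
impl<A, B, C> Composed<A, B, C> {
    fn call(&self, a: A) -> C { self.g.call(self.f.call(a)) }
}

// A caller can now demand the property in its signature.
fn run_in_hot_path<F: NoAlloc>(_f: &F) { /* ... */ }

fn main() {
    let double = NoAllocFn::assert_no_alloc(|x: i32| x * 2);
    let incr = NoAllocFn::assert_no_alloc(|x: i32| x + 1);
    let pipeline = Composed { f: double, g: incr };
    run_in_hot_path(&pipeline);
    println!("{}", pipeline.call(20)); // 41
}
```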

Or, for a more interesting example, we could define monotonic functions that approach a limit. Computation with the function may appear unbounded/non-terminating, but we could equip the function itself with a limit value and a delta-epsilon proof (along the lines of 'give me any arbitrary closeness e to the limit, and I will iterate until I am closer to the limit than that', i.e. a function of type e -> (d, (f.limit - f(d)) < e)).
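Concretely, that contract might look like the following hedged sketch, with invented names and a runtime check standing in for a real proof term.

```rust
// Illustrative sketch: a "monotonically approaches a limit" function as a trait.
// A real version would return a proof term instead of checking at runtime.
trait ApproachesLimit {
    fn limit(&self) -> f64;
    fn value_at(&self, steps: u64) -> f64;

    // The e -> (d, (limit - f(d)) < e) part: given any epsilon, find a step
    // count d whose value is within epsilon of the limit.
    fn within(&self, epsilon: f64) -> (u64, f64) {
        let mut d = 0;
        loop {
            let v = self.value_at(d);
            if (self.limit() - v).abs() < epsilon {
                return (d, v);
            }
            d += 1;
        }
    }
}

// Example: partial sums of 1/2 + 1/4 + ... approach 1 monotonically.
struct GeometricSum;
impl ApproachesLimit for GeometricSum {
    fn limit(&self) -> f64 { 1.0 }
    fn value_at(&self, steps: u64) -> f64 { 1.0 - 0.5f64.powi(steps as i32) }
}

fn main() {
    let (d, v) = GeometricSum.within(1e-6);
    println!("within 1e-6 of the limit after {d} steps: {v}");
}
```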

These ought not to require changing the language or behaviour external to the language: the language ought to be capable of expressing these concepts within itself.

It would be really rather nice if we could use our established familiarity with interfaces/traits to reason about functions in the same way, especially if we could reason about new semantics or constraints without having to entirely abandon our old language (plus its ecosystem) just to get access to memory/thread/termination/whatever-safety.

Or, from a different perspective, given that programming is about composing small functions into larger ones, why is it that data (and its invariants) may be directly safely abstracted, but functions require wrapping?

And yes, algebraic effect systems or dependent monad systems alleviate some of the issues, such as when we want to reason about what a function does. However, it is a bit harder to use them to reason about intension than extension, or to reason about negative conditions (e.g. f never_segfault, as proving f_can_segfault -> absurd is a bit tricky). Or then again, I could be missing some trick that makes existing effect systems 'click'.

By applying these as a language-aware trait/typeclass-like 'thing', programming with proofs can hopefully do away with manually lugging around a collection of proofs for each function, the same way traditional typeclasses/traits do away with the manually-lugged dictionary of method implementations.

@edmundsmith (Author)

A more directly useful thing invertible macros should give is the ability to transpose semantic analyses of the expanded 'source' back into the pre-expansion source. Hopefully this will make things like DSLs or helpers simpler to understand and make error messages much clearer.

You might find @justinpombrio's thesis, Resugaring, of interest!

Thanks for the recommendation, that does look interesting!
