Skip to content

Instantly share code, notes, and snippets.

@sdboyer
Last active July 27, 2018 01:38
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save sdboyer/8be4cac4d75e75e1c12eece108937076 to your computer and use it in GitHub Desktop.
Save sdboyer/8be4cac4d75e75e1c12eece108937076 to your computer and use it in GitHub Desktop.
incompats with shadows

Note: this is a WIP draft to outline the idea, not polished thoughts - will improve on it as i go

Let's start with a simplifying assumption: everything that's happening is API-level breakages in the v0 range, and that making the API signatures line up is sufficient to guarantee a truly compatible build. This is obviously not true, but it is useful because it means we can establish what ideal behavior looks like, and then seek to emulate it.

Say that we have some module at some version, A@v1.0.0, and another module it depends on, B@v0.1.0. We may reasonably expect that, in any future release, B may change its exported API in a way that is incompatible with the way that A@v1.0.0's consumption of it. This is most likely because Bjorn made a change that fell outside of G1P, but it also may be because A was consuming B in a way that cross-compatibility rules identify as being risky.

At the same time, we must also note that even if B does make a backwards-incompatible change, it does not necessarily mean it was a breaking change for A@v1.0.0, because the latter may not have been consuming that part of the former's API that was changed in a breaking way.

However, no matter which case we're in, the type checker can provide us an accurate answer. Moreover, the type checker can do this without requiring any changes to A@v1.0.0 itself, which ideally should be an immutable artifact. Specifically, proceeding pairwise through the list of all versions of B , we can create a new list - the sublist of B's original version list with which A@v1.0.0 is type-compatible.

There are some possible shapes that list could take, none of which are mutually exclusive:

  • It could have cut off some versions earlier than B@v0.1.0 - an unambiguous indication that A's use of B has some minimum version.
  • It could have cut off some sequence of versions later than B@v0.1.0, starting with some particular version and then continuing to the end of the list. This unambiguously indicates that there is some maximum version, beyond which B has made a change with which A@v1.0.0 is incompatible. Somebody, probably B but possibly A, has violated G1P.
  • There could be versions, individual ones or arbitrarily long sequences of them, with which A@v1.0.0 is incompatible, but where there is some version after that sequence with which A@v1.0.0 is compatible again. This unambiguously indicates a breaking change made by B, which was later reverted.

If we are looking simply at the pass/fail output of the type checker, then it is possible that there may be multiple, distinct incompatibilities might be hiding in a single sequence of incompatible versions. At the level of pure pass/fail analysis (which is where we generally want to stay), we cannot distinguish between these.

All of these boundaries will remain true, no matter when the typechecker does its pairwise evaluation. A new release of A, say A@v1.0.1, may have a completely different cross-compatibility list compared A@v1.0.0.

The question is, can we create a set of declarations, and rules for interpreting them, which allow us to achieve the same effect with behavior-level incompatibilities, without requiring arbitrary modifications of old releases? I think we can. And I think we can do so, in the most stripped-down manner, with exactly two types of declarations: current and incompatible. It's both a little easier to think about, and a little more accurate, if we also add a minimum declaration, though that could technically be expressed in terms of incompatible.

Say that we have a pair of modules, A and B:

ttc-base.png

And that we have a current declaration between A@v1.1.0 and B@v1.1.1 - that is, A@v1.1.0 currently uses B@v1.1.1 in its build:

ttc-current.png

Now, imagine Aparna discovered a problem with B@v1.2.0 and B@v1.3.0, doesn't have time to deal with it right now, so declares them incompatible in a new release, A@v1.1.1, without changing the current version:

ttc-pub-incompat.png

The key thing that we need here is to make sure that these incompatibilities, discovered after Aparna published A@v1.0.0, make it back to that version. So that's what we'd need the algorithm to do - treat the incompatibility set for a particular version as a combination of both its own declarations, and future incompatibility declarations (but, crucially, not past incompatibility declarations):

ttc-back-project.png

Now, solving processes - manual or automated - won't lose context about known incompatibilities as they move backwards through the release history.

Now, that is a double-edged sword - the further we go back in time, some false negatives are likely to occur. This is certainly a cost, though a bearable one, I think. That's especially the case because the only way this system could work is if it supports cases where existing releases get shadowed by later incompatibility discoveries. Those will necessarily occur. So, we exploit the necessity of supporting that outcome, by allowing a mode in which we can force version selection back into the shadow of an incompatibility declaration. This removes essentially all the possible harmful externalities of incompatibility declarations, in a way that reuses existing structures of the problem to its own advantage.

The only thing we're missing is a minimum version, so that solving processes don't backtrack too far on the B side, either. However, if the search processes also incorporate pairwise type-checking, then we're getting at least a baseline minimum, without needing any declarations from humans whatsoever. That is, we're already operating on the subset of the list of versions of B, per the ideal example above.

Let's say that the type-minimum for A@v1.1.0 is B@v1.0.0:

ttc-typemin.png

Incorporating type analysis will eliminate the vast majority of stale minimum cases, especially because it need not be declared. But there are inevitably times when there are behavior-level requirements that are not visible in the exported API. For such cases, we also need an explicit, declared minimum. Let's say that Aparna also made such a declaration on A@v1.1.0:

ttc-decl-min.png

Now, let's say that when Aparna published A@v1.1.1, she also incorporated an exported identifier that was first introduced in B@v1.1.0:

ttc-111-typemin.png

When she goes to publish, It is trivial for the type checker to tell her that the explicitly declared minimum of B@v1.0.1, left over from before, is now stale, because the release she's about to roll couldn't possibly correctly use anything earlier than B@v1.1.0. So she can remove the declaration from the file (though it doesn't hurt anything to leave it in).

Minimum version is an inverted declaration of incompatibility - instead of having to name all the incompatible versions, you just identify the version at which compatibility begins. In a pinch - e.g., if Aparna published a release and forgot to identify the correct minimum version - you can name all preceding versions, and it has the same effect. The problem is that it would also project those incompatibilities backwards to all preceding versions, essentially establishing that minimum across the entire release history of A.

TODO - explore how subsequent releases that resolve incompatibility issues can result in range-capping, other behaviors, etc., through unambiguous inferences drawn from just these declaration structures.

TODO - an additional structure for describing incompatibilities between two dependencies, not between self and dependency

@sdboyer
Copy link
Author

sdboyer commented May 28, 2018

really a first draft, will update as i go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment