@ibabushkin
Created May 15, 2017 12:21

Semantic Versioning in Rust

The goal of this document is to outline a definition of what items are subject to semantic versioning in a library crate. In the long term, I'm going to implement a tool to automatically check for a crate's compatibility with semantic versioning, given a pair of different versions of the same crate.

We assume version 2.0.0 of the SemVer specification [1]. The relevant aspects of it are quoted and commented on below. The semantics of crate version numbers are already specified [2], and their interpretation is baked into cargo. This leaves us to define what a crate's API is and to classify changes to it.

Definition of a crate's API

  1. Software using Semantic Versioning MUST declare a public API. This API could be declared in the code itself or exist strictly in documentation. However it is done, it should be precise and comprehensive.

The public API of a library crate consists of the set of exported function and method signatures, types, traits, and trait implementations, together with all export paths these items can be imported through. While a change to a function or method body can be breaking or non-breaking (see below for a definition), this can't be checked in an automated fashion and is outside the scope of this document. Changes to function and method bodies therefore have to be carefully reviewed by the crate author.
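To make the definition concrete, here is a minimal sketch of a made-up crate, modelled as a module. The names `geometry`, `shapes`, `Circle`, `Area` and `validate` are assumptions chosen for illustration and do not come from any real library. The comments mark which parts fall under the API definition above.

```rust
// Hypothetical crate, modelled as a module so the sketch is self-contained.
pub mod geometry {
    pub mod shapes {
        /// Public type: part of the API.
        #[derive(Debug, Clone, PartialEq)]
        pub struct Circle {
            pub radius: f64,
        }

        /// Public trait: part of the API.
        pub trait Area {
            fn area(&self) -> f64;
        }

        /// Public trait implementation: part of the API.
        impl Area for Circle {
            fn area(&self) -> f64 {
                // The body is not covered by the API definition, only the signature is.
                std::f64::consts::PI * self.radius * self.radius
            }
        }

        // Private helper: not part of the API.
        #[allow(dead_code)]
        fn validate(radius: f64) -> bool {
            radius >= 0.0
        }
    }

    // Re-export: `Circle` is now reachable through two export paths,
    // `geometry::shapes::Circle` and `geometry::Circle`. Both paths belong to the API.
    pub use self::shapes::Circle;
}
```

Both paths name the same type, so a user may depend on either one, and removing either path would affect that user.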

Definition of changes

The semantic versioning specification classifies changes as "backwards compatible", "backwards incompatible" and "bug fix", but only informally. To automate the classification process as far as possible, we need a definition tailored to the Rust language and its specific features, as well as to the API definition given above.

  1. Patch version Z (x.y.Z | x > 0) MUST be incremented if only backwards compatible bug fixes are introduced. A bug fix is defined as an internal change that fixes incorrect behavior.

Backwards compatible bug fixes are all changes to the body of public functions or methods. Changes to private items in a crate are allowed as well. Note that we actually allow more changes than the spec does, since we cannot verify how a change to a function definition affects its behaviour.
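As an illustration of this class, the sketch below models two versions of a made-up crate as modules `v1` and `v2` (both names and the function `mean` are assumptions). The signature is identical in both versions, so under the definition above the change counts as a patch-level change, even though the observable behaviour for empty input differs.

```rust
// Hypothetical old version of the crate.
pub mod v1 {
    /// Arithmetic mean; for an empty slice this divides zero by zero and yields NaN.
    pub fn mean(values: &[f64]) -> f64 {
        values.iter().sum::<f64>() / values.len() as f64
    }
}

// Hypothetical new version: same signature, different body.
pub mod v2 {
    /// Arithmetic mean; an empty slice now yields 0.0 instead of NaN.
    pub fn mean(values: &[f64]) -> f64 {
        if values.is_empty() {
            return 0.0;
        }
        values.iter().sum::<f64>() / values.len() as f64
    }
}
```

A signature-based check sees no difference between the two versions, which is exactly why body changes are left to the crate author's review.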

  1. Minor version Y (x.Y.z | x > 0) MUST be incremented if new, backwards compatible functionality is introduced to the public API. It MUST be incremented if any public API functionality is marked as deprecated. It MAY be incremented if substantial new functionality or improvements are introduced within the private code. It MAY include patch level changes. Patch version MUST be reset to 0 when minor version is incremented.

Backwards compatible (non-breaking) changes are changes to the type signature of an item that don't change its semantics in any way, additions of new export paths for already existing items, and additions of new items, provided that no ambiguity can arise due to the addition. Note that generalization of type signatures is a breaking change: for every instance of a more specialized type of an item, there exists a situation where type inference relies on the type information it offers, which a more general signature does not provide.

Even the addition of an item or an export path can be considered breaking, since in the case of a wildcard import the newly created item can clash with a name in user code. However, we specifically ignore this possibility: the case can be constructed in most languages, the specification does not treat it as relevant to semantic versioning, and guarding against it would shrink the set of changes considered non-breaking to a very small and impractical one.
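The point about generalized signatures can be seen in a small sketch. The modules `v1` and `v2` and the function `shout` are made up for this example. In `v1` the parameter type `String` tells the compiler what `.into()` should produce; in `v2` the generic parameter no longer pins that type down, so the same call stops compiling.

```rust
// Hypothetical old version: concrete parameter type.
pub mod v1 {
    pub fn shout(s: String) -> String {
        s.to_uppercase()
    }
}

// Hypothetical new version: generalized parameter type.
pub mod v2 {
    pub fn shout<S: Into<String>>(s: S) -> String {
        s.into().to_uppercase()
    }
}

/// User code written against v1: the target type of `.into()` is inferred
/// from the parameter type `String`.
pub fn caller_against_v1() -> String {
    v1::shout("abc".into())
}

/// The same user code against v2 has to change.
pub fn caller_against_v2() -> String {
    // v2::shout("abc".into()) is rejected by the compiler ("type annotations needed"),
    // because many types satisfy both `From<&str>` and `Into<String>`.
    v2::shout("abc")
}
```

Every call that worked with an explicit `String` still works in `v2`; only callers that relied on inference break, which is the situation described above.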

  1. Major version X (X.y.z | X > 0) MUST be incremented if any backwards incompatible changes are introduced to the public API. It MAY include minor and patch level changes. Patch and minor version MUST be reset to 0 when major version is incremented.

Backwards incompatible (breaking) changes are considered to be any changes to the public items of an API that aren't covered by the other two classes of changes described above.
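The three classes form an ordering, and the most severe change found decides the minimal version bump. The following is a minimal sketch of that idea, assuming a major version greater than zero as in the quoted rules. The names `ChangeCategory`, `Version`, `overall` and `minimal_next_version` are hypothetical and not taken from any existing tool.

```rust
/// The three classes of changes; the derived ordering follows declaration order,
/// so `Patch < NonBreaking < Breaking`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ChangeCategory {
    /// Body-only changes and changes to private items.
    Patch,
    /// Backwards compatible changes such as added items or export paths.
    NonBreaking,
    /// Everything not covered by the other two classes.
    Breaking,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// The most severe category among all detected changes.
/// An empty list is treated as a patch-level release (an assumption of this sketch).
pub fn overall(changes: &[ChangeCategory]) -> ChangeCategory {
    changes.iter().copied().max().unwrap_or(ChangeCategory::Patch)
}

/// Smallest version that the quoted rules allow after `old` (assumes `old.major > 0`).
pub fn minimal_next_version(old: Version, category: ChangeCategory) -> Version {
    match category {
        ChangeCategory::Patch => Version { patch: old.patch + 1, ..old },
        ChangeCategory::NonBreaking => Version { minor: old.minor + 1, patch: 0, ..old },
        ChangeCategory::Breaking => Version { major: old.major + 1, minor: 0, patch: 0 },
    }
}
```

A checker could then compare the version the author actually published against `minimal_next_version` and report a violation if the published version is lower.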

Consequences for an implementation

This model of crate APIs and changes allows us to evaluate the possible approaches to algorithmically identifying different classes of changes between versions.

Before comparing the different crate versions, we need to check whether versions of dependencies have been updated. If so, these changes affect the result of the comparison as well, since compatibility is transitive. For example, if crate a depends on crates b and c, and b depends on c, then b needs to update its major version if it begins to depend on a new major version of c, as otherwise a could face a transitive dependency conflict. Once such cases are out of the way, the actual analysis can be performed.
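The dependency check could look roughly like the sketch below. It is deliberately simplified: each dependency is reduced to its name and major version, and every dependency is assumed to be visible to downstream crates. The function name `deps_with_new_major` is hypothetical.

```rust
use std::collections::BTreeMap;

/// Names of dependencies that are present in both versions of a crate
/// but with a different major version.
/// Simplifying assumption: a dependency is described by its major version only.
pub fn deps_with_new_major<'a>(
    old: &BTreeMap<&'a str, u64>,
    new: &BTreeMap<&'a str, u64>,
) -> Vec<&'a str> {
    new.iter()
        .filter(|(name, new_major)| {
            // Only dependencies that already existed in the old version are compared.
            matches!(old.get(*name), Some(old_major) if old_major != *new_major)
        })
        .map(|(name, _)| *name)
        .collect()
}
```

In the example from the text, if b moves from major version 1 of c to major version 2, the result for b is non-empty, so b itself has to be released with a new major version.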

The first step is to construct the total set of all items exported by a crate, as well as all the paths to each item. Obviously, identifying paths that reference the same item in one version of the crate should be done here as well. It makes sense to store the information obtained this way in a map-like structure, with export paths as keys and public items as values, where multiple keys can map to the same value and a value has knowledge of all keys pointing to it.
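One possible shape for such a structure is a pair of maps that are kept in sync. This is only a sketch: `ItemId` stands in for whatever handle the compiler would provide for an item, and `ExportMap` with its methods is a hypothetical name, not an existing API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Placeholder for a compiler-provided item identifier (an assumption of this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct ItemId(pub usize);

#[derive(Debug, Default)]
pub struct ExportMap {
    /// Export path -> the item it resolves to.
    by_path: BTreeMap<String, ItemId>,
    /// Item -> every export path that reaches it.
    paths_of: BTreeMap<ItemId, BTreeSet<String>>,
}

impl ExportMap {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records that `path` resolves to `item`, keeping both directions consistent.
    pub fn insert(&mut self, path: &str, item: ItemId) {
        if let Some(previous) = self.by_path.insert(path.to_string(), item) {
            // The path used to point at another item: drop the stale reverse entry.
            if previous != item {
                if let Some(paths) = self.paths_of.get_mut(&previous) {
                    paths.remove(path);
                }
            }
        }
        self.paths_of.entry(item).or_default().insert(path.to_string());
    }

    /// The item reachable through `path`, if any.
    pub fn item_at(&self, path: &str) -> Option<ItemId> {
        self.by_path.get(path).copied()
    }

    /// All export paths of `item`, in sorted order.
    pub fn paths_to(&self, item: ItemId) -> Vec<&str> {
        self.paths_of
            .get(&item)
            .map(|paths| paths.iter().map(String::as_str).collect())
            .unwrap_or_default()
    }
}
```

Storing the reverse direction explicitly costs some memory, but it answers "which paths lead to this item" without scanning every key.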

Given such a structure, the identification of path-related manipulations should be straightforward: the symmetric difference of the two sets of export paths can be examined to enumerate removed and newly included paths, recording the results.
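A minimal sketch of this step, using the standard library's `BTreeSet::symmetric_difference`. Export paths are represented as plain strings here, which is an assumption of the sketch; `PathDiff` and `diff_paths` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Result of comparing the export paths of two crate versions.
#[derive(Debug, PartialEq, Eq)]
pub struct PathDiff<'a> {
    /// Paths present only in the old version (candidates for breaking changes).
    pub removed: Vec<&'a str>,
    /// Paths present only in the new version (candidates for non-breaking changes).
    pub added: Vec<&'a str>,
}

pub fn diff_paths<'a>(old: &BTreeSet<&'a str>, new: &BTreeSet<&'a str>) -> PathDiff<'a> {
    let mut diff = PathDiff { removed: Vec::new(), added: Vec::new() };
    // The symmetric difference contains exactly the paths found in one set only.
    for path in old.symmetric_difference(new) {
        if old.contains(path) {
            diff.removed.push(*path);
        } else {
            diff.added.push(*path);
        }
    }
    diff
}
```

Paths that appear in both sets are not reported here; they are the input to the item-by-item comparison.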

After this initial step, each pair of items taken from the old and new set of items, respectively, that shares at least one export path in both versions of the crate is considered to be two versions of the same item (that possibly underwent changes). The changes to its declaration across versions are then examined in a manner compatible with the specification outlined above. This most likely requires instrumenting the compiler to resolve the types and then to check for equality.
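The matching rule can be sketched as follows. Items are represented by bare integer identifiers and each version by a map from export path to identifier; both representations and the name `match_items` are assumptions made for this example. The comparison of the matched declarations themselves is not shown, since it depends on compiler internals.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Pairs of (old item, new item) that share at least one export path.
/// Each pair is treated as two versions of the same item.
pub fn match_items<'a>(
    old: &BTreeMap<&'a str, u32>,
    new: &BTreeMap<&'a str, u32>,
) -> BTreeSet<(u32, u32)> {
    old.iter()
        // Keep only paths that exist in both versions and pair up their targets.
        .filter_map(|(path, old_id)| new.get(*path).map(|new_id| (*old_id, *new_id)))
        // Collecting into a set removes duplicates from items sharing several paths.
        .collect()
}
```

An item that lost one of its export paths but kept another is still matched, so the removed path and any change to the declaration are reported separately.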

Such an approach is sound with regard to the specification of semantic versioning, as presented in previous sections, and can be implemented in a modular fashion, as obtaining the crate versions to be analyzed, generating interface descriptions, and comparing them for semantic compatibility are separate, well-contained steps. Should this specification have to be extended, for instance to include comparisons across language versions, such changes can be made with reasonable complexity as well.

Implementation details

To conclude this document, we give a basic overview of a tool that automates checks of a crate's versioning scheme. It serves as a guideline for the tool's future implementation.

Considering that most of the functionality described above requires reusing code from the type checking and name resolution phases normally performed by the compiler, the usage of the librustc family of crates seems to be inevitable, forcing the implementation to target Rust nightlies. This suggests that a design analogous to clippy, with a custom driver, would be the most efficient way to implement such a system.

The framework in which the actual analysis is performed hasn't yet been described. In a previous discussion thread [3] it became clear that the most viable solution would consist of a component that pulls the latest published crate version from crates.io and drops it into a temporary directory on disk. Both versions are then run through the compiler, with the two instances communicating with each other to allow the actual comparison machinery to run. However, it is worth discussing whether a more compact approach to working with multiple versions of the same crate can be derived.

The actual tool itself would then be run as a cargo subcommand on the crate in the current directory.
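Cargo runs an external subcommand `cargo foo` by executing a binary named `cargo-foo` and passing the subcommand name as its first argument. A small sketch of the argument handling this implies is shown below; the subcommand name `semver` and the function `tool_args` are hypothetical choices for this example.

```rust
/// Hypothetical subcommand name, chosen only for this sketch.
pub const SUBCOMMAND: &str = "semver";

/// Returns the arguments meant for the tool itself.
/// `raw` is expected to look like the process arguments: the binary name first,
/// then (when started through cargo) the subcommand name, then the user's arguments.
pub fn tool_args<I: IntoIterator<Item = String>>(raw: I) -> Vec<String> {
    // Drop the binary name.
    let mut args: Vec<String> = raw.into_iter().skip(1).collect();
    // When invoked as `cargo semver ...`, cargo inserts the subcommand name first.
    if args.first().map(String::as_str) == Some(SUBCOMMAND) {
        args.remove(0);
    }
    args
}
```

In a real binary this would be called as `tool_args(std::env::args())`, so the tool behaves the same whether it is started through cargo or directly.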
