@djmattyg007
Last active November 26, 2022 02:08
Releasing new versions of tools and libraries - a manifesto

If you've come from the world of deploying production applications in a commercial setting, you might be familiar with the concept of continuous deployment. Fundamentally, it stipulates that every merge to main should be deployed. This makes a lot of sense in the age of modern web apps, where your users don't care about the underlying tech and aren't interacting directly with the software.

The world of tools and libraries is totally different. Our software is depended on by other software. It has a contract (implicit or explicit) that other things depend on. It usually has a version number, so we can tell when something has changed. The needs are different, and therefore we need a better strategy for releasing.

What constitutes a release?

A new release should have a reason for existing - a reason that brings value to users, not just toil. End users don't care if we updated our linting tools, for example. Changes that aren't visible to users - such as updating development dependencies or fixing a typo in the readme - don't require new releases, and they shouldn't be mentioned in the changelog either.

This also means changelogs should be hand-crafted. Attempting to generate them automatically from commit messages will only lead to a poor experience for your users. When writing a changelog entry, remember the following:

  • What would you want to see out of a high-quality changelog written by someone else?
  • Focus on the impact to users. If there are breaking changes, you might want to accompany the release with a migration guide.

If you know you're going to be making a bunch of changes close together in time, try to bundle up all of these changes into a single release. This reduces the amount of toil for end users.

Choosing a version number

People tend to put too much stock in version numbers. Version numbers are just numbers. They're cheap, and there's lots of them. If you botch a release, don't panic. Just fix the problem and create another release. What this means is that you shouldn't get too caught up on the numbers getting too high. It's okay! If your next version number is 1.100.0, maybe you should be congratulating yourself for providing so much value to your users without breaking anything :)

If you need to make a breaking change, bump the major version number. Make sure it's justified, and try to bundle it up with some other value to users if possible. Carrot and stick, rather than just stick.

If you're adding something new, bump the minor version number. Keep the patch version number truly for just bugfixes and nothing else. This helps to create trust with your users, and gives them confidence in upgrading to the latest version of your software.
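The bumping rules above can be sketched in a few lines of Python. This is only an illustration: the function name `bump` and the change kinds "breaking", "feature" and "bugfix" are made up for this example, and it assumes plain <major>.<minor>.<patch> versions with no pre-release suffixes.

```python
import re

# Assumes plain <major>.<minor>.<patch> versions, with no pre-release suffix.
SEMVER = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def bump(version: str, change: str) -> str:
    """Return the next version for a 'breaking', 'feature' or 'bugfix' change.

    The change kinds are hypothetical labels chosen for this sketch.
    """
    match = SEMVER.fullmatch(version)
    if match is None:
        raise ValueError(f"not a <major>.<minor>.<patch> version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if change == "breaking":
        # Breaking change: new major version, everything else resets.
        return f"{major + 1}.0.0"
    if change == "feature":
        # Something new: bump the minor version, reset the patch version.
        return f"{major}.{minor + 1}.0"
    if change == "bugfix":
        # Bugfixes and nothing else go in the patch version.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change!r}")
```

The hard part is not the arithmetic, it's deciding honestly which kind of change you're making.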

Most importantly, if you're changing something, think long and hard about what that change will actually mean to most users. It's rare that a change in behaviour doesn't break somebody's workflow. Ask yourself - is it actually not a breaking change, or are you just afraid of bumping the major version number? Remember, version numbers are cheap. Don't forget Hyrum's Law either. Just because you think you've clearly documented the public interface of the software, doesn't mean a seemingly innocent change won't break things for somebody.

Having said all that, it's always important to consider the overall impact. For example, consider the situation where you've just created a new release with breaking changes, and you've already bumped the major version number. If you botch that release and realise soon afterwards that you need to make more breaking changes, it's probably okay to bump the patch version number for the additional changes. You still need to write a new changelog entry though.

Is a new major version number enough?

Even if you follow semantic versioning, it doesn't deal with the fact that there's a hidden fourth component to your version number. Semantic versioning tells us that a version number looks like <major>.<minor>.<patch>. This is a lie. It actually looks like this: <name>.<major>.<minor>.<patch>.

What does this mean? If you release a new major version with so many breaking changes in one go that users have to fundamentally rewrite their own code, you've probably made a mistake. The project has been taken in such a drastically different direction that it's no longer the same project, and should probably have been released under a new name.

If you have no desire to maintain a legacy codebase any longer, that's fine. Releasing your rewrite under a new name has multiple advantages. For example, most common programming languages don't support multiple versions of the same library being installed simultaneously. Releasing under a new name lets users install both side by side, which provides a migration path that doesn't involve a "big bang" style operation.

Creating a release

I've seen some setups where the release process closely mimics traditional systems without continuous deployment. There's a single CI pipeline with a block step near the end. Unblocking the pipeline runs a few additional steps that publish the release and any artifacts. Some of the crazier setups prompt users for the new version number in the CI system when unblocking the pipeline.

All of this is generally a bad idea. Releases should be created by humans from their development machine. This doesn't mean the release process shouldn't be scripted! Sometimes there are necessarily several components to releasing a new version of a tool or library. This should absolutely be automated, utilising a CI pipeline if desired. It's just that all of this should be triggered by a developer running a script on their machine.

Here's what releasing a new version might look like, at a high level:

  1. Git tag created and pushed
  2. GitHub release created
  3. Release artifacts created and uploaded (potentially to multiple places)
  4. Announcement on Slack

Depending on what you have available to you, it might be that step 1 triggers your release pipeline. Perhaps you use the tag from step 1 to create the GitHub release in step 2, and that's what triggers your release pipeline instead. Maybe it's all contained within a single script. The details of what your release pipeline looks like aren't that important. The interesting bit is what comes before the tag is created. This is what the start of any release creation script should look like.

First, the script should attempt to validate that the repository is in a good state. For example:

  • There must be no uncommitted changes in the repository.
  • The current branch exists on the remote.
  • All changes on the current branch must have been pushed to the remote.

The idea is that before creating a release, all changes must already be on the remote for all to see, and the changes must have passed a standard CI build. In an ideal world, the script would check with the CI system to ensure this is the case. However, that may be infeasible or too onerous. That's okay - we're not trying to baby developers. Trying to verify everything to the ultimate level of detail is usually not worth the investment. You should already trust developers to do many other things. Due diligence before creating a release is one of those.
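The three repository checks listed above might look something like this in Python. Treat it as a sketch under assumptions rather than a finished script: it assumes the remote is called `origin`, that `git` is on the PATH, and that the script is run from inside the repository. The function names are invented for this example. It doesn't talk to the CI system at all.

```python
import subprocess


def git(*args: str) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def remote_branch_exists(branch: str, remote: str = "origin") -> bool:
    """Ask the remote whether it has the branch.

    Any failure (including a network error) is treated as "does not exist".
    """
    result = subprocess.run(
        ["git", "ls-remote", "--exit-code", "--heads", remote, f"refs/heads/{branch}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def find_problems(status_output: str, branch_on_remote: bool, unpushed: str) -> list[str]:
    """Turn raw git output into a list of human-readable problems."""
    problems = []
    if status_output:
        problems.append("there are uncommitted changes")
    if not branch_on_remote:
        problems.append("the current branch does not exist on the remote")
    elif unpushed:
        problems.append("the current branch has commits that have not been pushed")
    return problems


def check_repository(remote: str = "origin") -> list[str]:
    """Gather the git state and return any problems found (empty list if none)."""
    status_output = git("status", "--porcelain")
    # symbolic-ref fails on a detached HEAD, which is not releasable here anyway.
    branch = git("symbolic-ref", "--short", "HEAD")
    branch_on_remote = remote_branch_exists(branch, remote)
    unpushed = ""
    if branch_on_remote:
        git("fetch", remote)
        # Commits reachable from HEAD but not from the remote branch.
        unpushed = git("rev-list", f"{remote}/{branch}..HEAD")
    return find_problems(status_output, branch_on_remote, unpushed)
```

A release script would call `check_repository()` first and refuse to continue if the returned list isn't empty. Keeping the decision logic in `find_problems` separate from the git calls makes it easy to test without a real repository.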

Once the script has finished checking things, it should prompt the developer to enter a version number. This number should be the version they're expecting to release. It should be validated, to ensure it follows the versioning format the project expects to be using (e.g. semantic versioning). The script should then perform some additional checks:

  • The specified version must not have been released before. Remember, if you botched a release, use a new number. It's okay.
  • There must be a changelog entry, with an associated date that matches the current day. Yes, your changelog should have dates. Don't force your users to go on an archaeology expedition just to work out how old a particular version is.
  • If it's a library with version metadata stored in a package definition file (e.g. package.json, pyproject.toml, Cargo.toml, etc), make sure the same version is specified there.
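Here's a rough Python sketch of those version checks. Several details are assumptions made purely for illustration: tags are named `v<version>`, changelog entries use a heading of the form `## 1.3.0 - 2022-11-26`, and the package definition file is a package.json. Your project's conventions will differ, and so will the helper names.

```python
import datetime
import json
import re

# Assumes plain <major>.<minor>.<patch> versions.
SEMVER = re.compile(r"\d+\.\d+\.\d+")


def is_valid_version(version: str) -> bool:
    """Check the version follows the format the project expects."""
    return SEMVER.fullmatch(version) is not None


def already_released(version: str, existing_tags: set[str]) -> bool:
    """Assumes every release is tagged as v<version>."""
    return f"v{version}" in existing_tags


def has_changelog_entry(changelog: str, version: str, today: datetime.date) -> bool:
    """Assumes headings of the form '## <version> - <YYYY-MM-DD>'."""
    heading = f"## {version} - {today.isoformat()}"
    return any(line.strip() == heading for line in changelog.splitlines())


def package_version_matches(package_json: str, version: str) -> bool:
    """Compare against the "version" field of a package.json document."""
    return json.loads(package_json).get("version") == version


def version_problems(version, existing_tags, changelog, package_json, today):
    """Run all the checks and return a list of problems (empty if none)."""
    if not is_valid_version(version):
        return [f"{version!r} is not a valid version number"]
    problems = []
    if already_released(version, existing_tags):
        problems.append(f"{version} has already been released")
    if not has_changelog_entry(changelog, version, today):
        problems.append(f"no changelog entry for {version} dated {today.isoformat()}")
    if not package_version_matches(package_json, version):
        problems.append(f"package.json does not specify version {version}")
    return problems
```

In a real script, `existing_tags` would come from something like `git tag --list`, and `today` from `datetime.date.today()`.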

Remember, the script has already verified that the repository is in a clean state. This means that your changelog and any other files should already have been updated, committed and pushed.

Finally, the script should create the tag and push it to the remote, kicking off the release process.

What naming convention should your tags follow?

Prefer prefixing your tags with a v. For example, version 1.2.3 would be represented by the tag v1.2.3.

For any long-lived project, it would be folly to think you'd never want to create any tags other than version numbers. Adding a prefix lets you easily filter them out from all your other tags and branches. This may be important when setting up a release pipeline in your CI system, for example.

Resist the temptation to use tags with any words or other characters that aren't strictly related to the version. For example, RELEASE-v1.2.3 is a bad idea - it makes the job of dependency management bots like Renovatebot much more difficult than it needs to be.
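To show why a strict convention pays off, here's a small Python sketch that picks version tags out of a mixed list of tags. It assumes plain v<major>.<minor>.<patch> tags, and the function name `version_tags` is made up for this example.

```python
import re

# Matches tags like v1.2.3 and nothing else.
VERSION_TAG = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def version_tags(tags: list[str]) -> list[tuple[int, int, int]]:
    """Return (major, minor, patch) tuples for version tags, sorted ascending."""
    versions = []
    for tag in tags:
        match = VERSION_TAG.fullmatch(tag)
        if match is not None:
            versions.append(tuple(int(part) for part in match.groups()))
    # Tuples of ints sort numerically, so 1.10.0 comes after 1.2.3.
    return sorted(versions)
```

With a single, predictable pattern, everything that isn't a version simply falls through. A tag like RELEASE-v1.2.3 would need its own special case in every tool that reads your tags.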

Backporting fixes

Sometimes you'll need to backport fixes to older versions of the software. Most of the time, this will be across major version boundaries. Sometimes, you might have discovered an accidentally breaking change after the release was created, and decided to roll forwards. Regardless of the reason, it's usually because something has happened that makes it difficult for some of your users to upgrade to the very latest release.

Fortunately, nothing about the release process outlined above is tied to any specific branch. This is on purpose! It should mean that backporting fixes to older versions is (hopefully) as simple as cherry-picking commits onto a branch dedicated to the previous version, and then running the same release creation script.

Start by creating a branch for the previous version. If you've released v2, and you need to backport something to v1, go back to the last commit that didn't involve a breaking change, and create a branch named 1.x. If you've released v3.4.0, and then realised it contained a breaking change, you might name the branch 3.3.x. Don't prefix these branch names with v. They're not specific version numbers, and this could be confusing to your automation (as outlined above).
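If you want your tooling to suggest the branch name, the naming scheme above is easy to express in Python. This is just a sketch: `maintenance_branch` and its `scope` parameter are hypothetical names, and the version passed in is the last good release on the line you want to maintain.

```python
def maintenance_branch(version: str, scope: str = "major") -> str:
    """Return the maintenance branch name for a <major>.<minor>.<patch> version.

    scope="major" gives a name like 1.x; scope="minor" gives a name like 3.3.x.
    There is deliberately no v prefix, so these never look like version tags.
    """
    major, minor, _patch = version.split(".")
    if scope == "major":
        return f"{major}.x"
    if scope == "minor":
        return f"{major}.{minor}.x"
    raise ValueError(f"unknown scope: {scope!r}")
```

For the two situations described above, `maintenance_branch("1.9.2")` gives 1.x, and `maintenance_branch("3.3.0", scope="minor")` gives 3.3.x.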

Now commit the fixes to the dedicated release branch. If you can, cherry-pick an existing commit from the current version. This is preferable, as it preserves ownership information for the person who actually authored the changes. If a patch doesn't apply cleanly to an older version though, that's okay.

Once the branch is ready, just create a release like any other release.

Conclusion

Never forget that each new release prompts users to update to the latest version. This should be front of mind when making any change at all. After all, what is the point of software if not for the users? In the age of Renovatebot and Dependabot, each new release can have a ripple effect. Lots of compute is spent across potentially thousands of projects, running CI pipelines to make sure the new version you just released didn't break anything.

Every time you release a new version, it places a burden on users, and on the environment. Be mindful, care about others, and do the right thing by those around you.
