djmattyg007/manifesto.md

# Releasing new versions of tools and libraries - a manifesto

If you've come from the world of deploying production applications in a commercial setting, you
might be familiar with the concept of continuous deployment. Fundamentally, it stipulates that every
merge to main should be deployed. This makes a lot of sense in the age of modern web
apps, where your users don't care about the underlying tech and aren't interacting directly with the
software.

The world of tools and libraries is totally different. Our software is depended on by other
software. It has a contract (implicit or explicit) that other things depend on. It usually has a
version number, so we can tell when something has changed. The needs are different, and therefore we
need a better strategy for releasing.
## What constitutes a release?

A new release should have a reason for existing - a reason that brings value to users, not just toil.
End users don't care if we updated our linting tools, for example. Any change that isn't visible to
users - such as updating development dependencies or fixing a typo in the readme - doesn't require a
new release, and it shouldn't be mentioned in the changelog either.

This also means changelogs should be hand-crafted. Attempting to generate them automatically from
commit messages will only lead to a poor experience for your users. When writing a changelog entry,
remember the following:

- What would you want to see out of a high-quality changelog written by someone else?
- Focus on the impact to users. If there are breaking changes, you might want to accompany the
  release with a migration guide.

If you know you're going to be making a bunch of changes close together in time, try to bundle up
all of these changes into a single release. This reduces the amount of toil for end users.
## Choosing a version number

People tend to put too much stock in version numbers. Version numbers are just numbers. They're
cheap, and there are lots of them. If you botch a release, don't panic. Just fix the problem and
create another release. What this means is that you shouldn't get too caught up on the numbers
getting too high. It's okay! If your next version number is 1.100.0, maybe you should be
congratulating yourself for providing so much value to your users without breaking anything :)

If you need to make a breaking change, bump the major version number. Make sure it's justified, and
try to bundle it up with some other value to users if possible. Carrot and stick, rather than just
stick.

If you're adding something new, bump the minor version number. Keep the patch version number truly
for just bugfixes and nothing else. This helps to create trust with your users, and gives them
confidence in upgrading to the latest version of your software.
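The bump rules above are simple enough to sketch in a few lines of Python. This is only an illustration: the function name `bump_version` and the change labels `"breaking"`, `"feature"` and `"bugfix"` are made up for this example, and it assumes plain `<major>.<minor>.<patch>` versions with no pre-release suffixes.

```python
import re

# Assumption: versions are plain <major>.<minor>.<patch>, nothing else.
VERSION = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def bump_version(current: str, change: str) -> str:
    """Return the next version for a 'breaking', 'feature' or 'bugfix' change."""
    match = VERSION.fullmatch(current)
    if match is None:
        raise ValueError(f"not a <major>.<minor>.<patch> version: {current!r}")
    major, minor, patch = (int(part) for part in match.groups())

    if change == "breaking":
        # Breaking change: bump major, reset minor and patch.
        return f"{major + 1}.0.0"
    if change == "feature":
        # Something new: bump minor, reset patch.
        return f"{major}.{minor + 1}.0"
    if change == "bugfix":
        # Bugfixes and nothing else: bump patch.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown kind of change: {change!r}")
```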
Most importantly, if you're changing something, think long and hard about what that change will
actually mean to most users. It's rare that a change in behaviour doesn't break somebody's workflow.
Ask yourself - is it actually not a breaking change, or are you just afraid of bumping the major
version number? Remember, version numbers are cheap. Don't forget Hyrum's Law either: with enough
users, every observable behaviour of your software will be depended on by somebody. Just because you
think you've clearly documented the public interface of the software doesn't mean a seemingly
innocent change won't break things for somebody.

Having said all that, it's always important to consider the overall impact. For example, consider
the situation where you've just created a new release with breaking changes, and you've already
bumped the major version number. If you botch that release, and it turns out you need to make more
breaking changes, and you've realised that soon after the release, it's probably okay to bump the
patch version number for the additional changes. You still need to write a new changelog entry
though.
## Is a new major version number enough?

Even if you follow semantic versioning, it doesn't deal with the fact that there's a hidden
fourth component to your version number. Semantic versioning tells us that a version number looks
like `<major>.<minor>.<patch>`. This is a lie. It actually looks like this:
`<name>.<major>.<minor>.<patch>`.

What does this mean? If you release a new major version with so many breaking changes at once that
users have to fundamentally rewrite their own code, you've probably made a mistake. The project has
been taken in such a drastically different direction that it's no longer the same project, and it
should probably have been released under a new name.

If you have no desire to maintain a legacy codebase any longer, that's fine. Releasing your rewrite
under a new name has multiple advantages. For example, most common programming languages don't
support multiple versions of the same library being installed simultaneously. Releasing under a
new name lets users install the old and new libraries side by side and migrate gradually, instead of
being forced into a "big bang" style operation.
## Creating a release

I've seen some setups where the release process closely mimics traditional systems without
continuous deployment. There's a single CI pipeline with a block step near the end. Unblocking the
pipeline runs a few additional steps that publish the release and any artifacts. Some of the
crazier setups prompt users for the new version number in the CI system when unblocking the
pipeline.

All of this is generally a bad idea. Releases should be created by humans from their development
machine. This doesn't mean the release process shouldn't be scripted! Sometimes there are
necessarily several components to releasing a new version of a tool or library. This should
absolutely be automated, utilising a CI pipeline if desired. It's just that all of this should be
triggered by a developer running a script on their machine.

Here's what releasing a new version might look like, at a high level:

1. Git tag created and pushed
2. GitHub release created
3. Release artifacts created and uploaded (potentially to multiple places)
4. Announcement on Slack

Depending on what you have available to you, it might be that step 1 triggers your release
pipeline. Perhaps you use the tag from step 1 to create the GitHub release in step 2, and that's
what triggers your release pipeline instead. Maybe it's all contained within a single script. The
details of what your release pipeline looks like aren't that important. The interesting bit is what
comes before the tag is created. This is what the start of any release creation script should look
like.

First, the script should attempt to validate that the repository is in a good state. For example:

- There must be no uncommitted changes in the repository.
- The current branch exists on the remote.
- All changes on the current branch must have been pushed to the remote.

The idea is that before creating a release, all changes must already be on the remote for all to
see, and the changes must have passed a standard CI build. In an ideal world, the script would check
with the CI system to ensure this is the case. However, that may be infeasible or too onerous.
That's okay - we're not trying to baby developers. Trying to verify everything to the ultimate level
of detail is usually not worth the investment. You should already trust developers to do many other
things. Due diligence before creating a release is one of those.
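As a rough sketch, the three repository checks could look something like this in Python. The function names and the default remote name `origin` are assumptions made for the example. The git runner is passed in as a parameter purely so the logic can be exercised without a real repository. Checking the CI system is deliberately left out.

```python
import subprocess
from typing import Callable, List, Optional


def run_git(args: List[str]) -> subprocess.CompletedProcess:
    """Run a git command in the current directory and capture its output."""
    return subprocess.run(["git", *args], capture_output=True, text=True)


def check_repo_state(git: Callable = run_git, remote: str = "origin") -> Optional[str]:
    """Return a description of the first problem found, or None if all is well."""
    # No uncommitted changes: porcelain output is empty for a clean working
    # tree. Untracked files show up here too, which is deliberate.
    if git(["status", "--porcelain"]).stdout.strip():
        return "there are uncommitted changes"

    branch = git(["rev-parse", "--abbrev-ref", "HEAD"]).stdout.strip()
    if branch in ("", "HEAD"):
        return "not currently on a branch"

    # The branch exists on the remote: with --exit-code, ls-remote exits
    # non-zero when no matching ref is found.
    if git(["ls-remote", "--exit-code", remote, f"refs/heads/{branch}"]).returncode != 0:
        return f"branch {branch!r} does not exist on {remote}"

    # Everything is pushed: after a fetch, no local commit may be missing
    # from the remote-tracking branch.
    git(["fetch", remote, branch])
    unpushed = git(["rev-list", "--count", f"{remote}/{branch}..HEAD"])
    if unpushed.returncode != 0 or unpushed.stdout.strip() != "0":
        return f"branch {branch!r} has commits that are not on {remote}"

    return None
```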
Once the script has finished checking things, it should prompt the developer to enter a version
number. This number should be the version they're expecting to release. It should be validated, to
ensure it follows the versioning format the project expects to be using (e.g. semantic versioning).
The script should then perform some additional checks:

- The specified version must not have been released before. Remember, if you botched a release, use
  a new number. It's okay.
- There must be a changelog entry, with an associated date that matches the current day. Yes, your
  changelog should have dates. Don't force your users to go on an archaeology expedition just to
  work out how old a particular version is.
- If it's a library with version metadata stored in a package definition file (e.g. package.json,
  pyproject.toml, Cargo.toml, etc.), make sure the same version is specified there.
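Validating the entered version and checking that it hasn't been released before might be sketched like this. The helper names are hypothetical, the pattern only accepts plain `<major>.<minor>.<patch>` versions (no pre-release or build suffixes), and it assumes release tags are the version prefixed with `v`.

```python
import re
from typing import Iterable

# Assumption: plain <major>.<minor>.<patch>, with no leading zeros.
VERSION = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def validate_version(text: str) -> str:
    """Return the cleaned-up version, or raise ValueError if it is malformed."""
    version = text.strip()
    if VERSION.fullmatch(version) is None:
        raise ValueError(f"{version!r} is not a <major>.<minor>.<patch> version")
    return version


def ensure_unreleased(version: str, existing_tags: Iterable[str]) -> None:
    """Raise ValueError if a tag for this version already exists."""
    # Assumption: release tags are the version number prefixed with "v".
    if f"v{version}" in set(existing_tags):
        raise ValueError(f"version {version} has already been released")
```

In a real script, `existing_tags` would probably be the output of `git tag --list`, split into lines.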

Remember, the script has already verified that the repository is in a clean state. This means that
your changelog and any other files should already have been updated, committed and pushed.
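The changelog and metadata checks depend heavily on the formats your project uses, so treat the following as one possible shape rather than a recipe. It assumes changelog headings that look like `## 1.2.3 - 2024-05-01` (square brackets around the version are tolerated), and a `package.json` with a top-level `version` field. Both function names are made up for the example.

```python
import datetime
import json
import re


def has_dated_changelog_entry(changelog: str, version: str, today: datetime.date) -> bool:
    """Check for a heading like '## 1.2.3 - 2024-05-01' carrying today's date."""
    # Assumption: this heading format. Adjust the pattern to your changelog.
    pattern = re.compile(
        r"^##\s+\[?" + re.escape(version) + r"\]?\s+-\s+"
        + re.escape(today.isoformat()) + r"\s*$",
        re.MULTILINE,
    )
    return pattern.search(changelog) is not None


def package_json_version_matches(package_json: str, version: str) -> bool:
    """Check that the 'version' field in package.json equals the new version."""
    return json.loads(package_json).get("version") == version
```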
Finally, the script should create the tag and push it to the remote, which kicks off the release process.
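That last step is small. A minimal sketch, assuming an annotated tag, a remote called `origin`, and a hypothetical helper name `tag_and_push`:

```python
import subprocess
from typing import Callable, List


def run_git(args: List[str]) -> subprocess.CompletedProcess:
    """Run a git command, raising an error if it fails."""
    return subprocess.run(["git", *args], check=True)


def tag_and_push(version: str, git: Callable = run_git, remote: str = "origin") -> str:
    """Create an annotated tag for the version, push it, and return its name."""
    tag = f"v{version}"
    git(["tag", "-a", tag, "-m", f"Release {tag}"])
    # Push the tag by its full ref name, so a branch with the same name
    # can't be picked up by mistake.
    git(["push", remote, f"refs/tags/{tag}"])
    return tag
```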
## What naming convention should your tags follow?

Prefer prefixing your tags with a v. For example, version 1.2.3 would be represented by the tag
v1.2.3.

For any long-lived project, it would be folly to think you'd never want to create any tags other
than version numbers. Adding a prefix lets you easily filter them out from all your other tags and
branches. This may be important when setting up a release pipeline in your CI system, for example.

Resist the temptation to use tags with any words or other characters that aren't strictly related to
the version. For example, RELEASE-v1.2.3 is a bad idea - it makes the job of dependency management
bots like Renovatebot much more difficult than it needs to be.
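To illustrate the filtering point, here's a small sketch of picking version tags out of a mixed list. The helper name `version_tags` is hypothetical, and the pattern assumes plain `v<major>.<minor>.<patch>` tags.

```python
import re
from typing import Iterable, List

# Assumption: release tags look exactly like v<major>.<minor>.<patch>.
VERSION_TAG = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def version_tags(tags: Iterable[str]) -> List[str]:
    """Keep only version tags, sorted from oldest to newest version."""
    matched = []
    for tag in tags:
        match = VERSION_TAG.fullmatch(tag)
        if match is not None:
            # Sort numerically, so that v1.10.0 comes after v1.2.3.
            matched.append((tuple(int(part) for part in match.groups()), tag))
    return [tag for _, tag in sorted(matched)]
```

Given `["v1.10.0", "nightly", "v1.2.3", "RELEASE-v1.2.3", "1.x"]`, this returns `["v1.2.3", "v1.10.0"]`. A single simple pattern is enough precisely because the tags contain nothing but the prefix and the version.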
## Backporting fixes

Sometimes you'll need to backport fixes to older versions of the software. Most of the time, this
will be across major version boundaries. Sometimes, you might have discovered an
accidentally breaking change after the release was created, and decided to roll forwards. Regardless
of the reason, it's usually because something has happened that makes it difficult for some of your
users to upgrade to the very latest release.

Fortunately, nothing about the release process outlined above is tied to any specific branch. This
is on purpose! It should mean that backporting fixes to older versions is (hopefully) as simple as
cherry-picking commits onto a branch dedicated to the previous version, and then running the same
release creation script.

Start by creating a branch for the previous version. If you've released v2, and you need to
backport something to v1, go back to the last commit that didn't involve a breaking change, and
create a branch named 1.x. If you've released v3.4.0, and then realised it contained a breaking
change, you might name the branch 3.3.x. Don't prefix these branch names with v. They're not
specific version numbers, and this could be confusing to your automation (as outlined above).

Now commit the fixes to the dedicated release branch. If you can, cherry-pick an existing commit
from the current version. This is preferable, as it preserves ownership
information for the person who actually authored the changes. If a patch doesn't apply cleanly to an
older version though, that's okay.

Once the branch is ready, just create a release like any other release.
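The branch naming and the git commands involved could be sketched as follows. Both helper names are hypothetical, and the commit identifiers in the usage note are placeholders. The second function only builds the commands as argument lists, so you can review them before running anything.

```python
from typing import List


def maintenance_branch(last_good_version: str, scope: str) -> str:
    """Name the branch for backports: '1.x' style or '3.3.x' style, with no v prefix."""
    major, minor, _patch = last_good_version.split(".")
    if scope == "major":
        # Backporting across a major version boundary, e.g. from v2 to v1.
        return f"{major}.x"
    if scope == "minor":
        # A minor release turned out to contain a breaking change.
        return f"{major}.{minor}.x"
    raise ValueError(f"unknown scope: {scope!r}")


def backport_commands(branch: str, base_commit: str, fix_commits: List[str]) -> List[List[str]]:
    """Build the git commands that create the branch and cherry-pick the fixes."""
    # Start the branch from the last commit without a breaking change.
    commands = [["git", "checkout", "-b", branch, base_commit]]
    # Cherry-picking keeps the original author on each commit.
    commands += [["git", "cherry-pick", commit] for commit in fix_commits]
    return commands
```

For example, `maintenance_branch("3.3.5", "minor")` gives `"3.3.x"`, and `backport_commands("1.x", "abc123", ["def456"])` gives a `git checkout -b 1.x abc123` followed by a `git cherry-pick def456`.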
## Conclusion

Never forget that each new release prompts users to update to the latest version. This should be
front of mind when making any change at all. After all, what is the point of software if not for the
users? In the age of Renovatebot and Dependabot, each new release can have a ripple effect. Lots of
compute is spent across potentially thousands of projects, running CI pipelines to make sure the new
version you just released didn't break anything.

Every time you release a new version, it places a burden on users, and on the environment. Be
mindful, care about others, and do the right thing by those around you.