Emulating Gerrit workflow on branch-oriented collaboration tools

Paul Jolly (paul@myitcv.io) - 2020-04-13
There are many features of a Gerrit-based workflow that are attractive. When compared against alternative collaboration tools like GitHub, the most important features (in the context of this proposal) are:

the commit is the unit of change that is ready to be contributed
the commit message is a crucial part of that change and is reviewed along with the code
seamless support for chains of changes

These features:

support reviewing and merging one commit at a time, as opposed to multiple commits per GitHub Pull Request, for example
allow the developer to continue work on different changes whilst earlier work is being reviewed
encourage people to split unrelated changes into separate commits
encourage reviewing and merging work in smaller chunks, especially when those chunks are chained together

This proposal describes a mechanism by which a Gerrit-style workflow can be achieved in non-Gerrit, branch-oriented collaboration tools. The motivating example is GitHub and so the explanation that follows uses GitHub terms (although in theory the same approach could readily be used with GitLab etc).
Assumptions


The remote branch-oriented collaboration tool is assumed to be configured as a remote named origin. The published branch is assumed to be origin/master
The user will install the git-badger tool. git-badger is very similar to the git-codereview tool, but for branch-oriented collaboration tools
git-badger will have a branchpoint command that behaves identically to git-codereview branchpoint
git-badger will have a rebase-work command that behaves identically to git-codereview rebase-work
git-badger will have a change command that behaves identically to git-codereview change
git-badger will have a mail command that is similar to git-codereview mail. It pushes pending change commits, but varies in some important behavioural aspects covered below
git-badger will have a hooks command that behaves identically to git-codereview hooks

Terminology/key concepts


The user will maintain branches as they wish. We shall call these "work branches" to distinguish from automatically created "Pull Request branches" below
Like Gerrit-based workflows, all commits require a change ID
The variable $changeid is used to refer to the change ID of a given pending change $commit
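
To make the change ID concept concrete, here is a minimal sketch of how a tool might read $changeid from a commit message. It assumes git-badger adopts Gerrit's trailer convention (`Change-Id: I` followed by 40 hex digits in the final paragraph of the message); the proposal does not fix a format, and the function name `change_id` is hypothetical.

```python
import re
from typing import Optional

# Assumption: change IDs use Gerrit's trailer form, "Change-Id: I<40 hex digits>".
_CHANGE_ID = re.compile(r"^Change-Id: (I[0-9a-f]{40})$", re.MULTILINE)


def change_id(message: str) -> Optional[str]:
    """Return the change ID from the trailer paragraph of a commit message, or None."""
    # Only the last paragraph is inspected, so a "Change-Id:" line quoted
    # earlier in the message body is not mistaken for the trailer.
    paragraphs = message.strip().split("\n\n")
    match = _CHANGE_ID.search(paragraphs[-1])
    return match.group(1) if match else None
```

A commit without such a trailer yields `None`, which a tool could treat as "not yet a valid pending change".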

Detail


First the user creates a work branch using git badger change mywork.
They then make the relevant changes, git add, then git badger change
They might then make further changes, git add and git badger change to create a second commit on the mywork branch
Assuming the current branch is mywork, the pending changes can be listed by git log $(git badger branchpoint)..
git badger mail pushes pending change commits in the current branch to special Pull Request branches
git badger mail [-r remote] pushes to remote; if the -r flag is not provided, it selects the remote as a plain git push would (see "Additional Thoughts" below). We refer to this remote as $workremote
git badger mail works in the following way:

For each pending change, $commit, in the current work branch, git badger mail checks whether the tag _$changeid.mailed exists and points at $commit (i.e. the commit in mywork identified by $changeid). Unless both conditions hold, it creates a local Pull Request branch named _$changeid, pointing to $commit
It force-pushes the _$changeid branch to $workremote
It ensures a Pull Request exists for the _$changeid branch. We refer to the number of that Pull Request as $pr
It ensures the title of $pr matches the first line of $commit's commit message
It ensures the description of $pr matches the body of $commit's commit message
If a pending change's parent commit is not the branchpoint, then the title of $pr is prefixed with [DO NOT MERGE] (see "Additional Thoughts" section below). This is necessary because "later" Pull Request branches contain the commits from "earlier" Pull Request branches. Hence merging a "later" Pull Request (via whatever strategy) will also merge the commits from "earlier" Pull Requests. Therefore we only want the earliest unmerged Pull Request to be mergeable.
At this point the push is complete; the local Pull Request branch _$changeid is removed, and $commit is tagged with _$changeid.mailed
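
The per-commit steps above can be sketched as a pure planning function that decides which pending changes need pushing and what Pull Request metadata each should carry. This is only an illustration under stated assumptions: the `Commit` and `Push` types and the `plan_mail` name are hypothetical, and the sketch assumes that a commit whose `.mailed` tag already points at it is skipped entirely.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Commit:
    sha: str
    parent: str      # SHA of the parent commit
    change_id: str   # $changeid
    message: str     # full commit message


@dataclass
class Push:
    branch: str  # Pull Request branch to force-push to $workremote
    tag: str     # tag to place on the commit once the push is complete
    title: str   # desired Pull Request title
    body: str    # desired Pull Request description


def plan_mail(pending: List[Commit], branchpoint: str, tags: Dict[str, str]) -> List[Push]:
    """Plan the pushes for pending commits, listed oldest first.

    tags maps a tag name to the SHA it currently points at.
    """
    plan = []
    for c in pending:
        tag = f"_{c.change_id}.mailed"
        if tags.get(tag) == c.sha:
            continue  # tag exists and points at this commit: already mailed
        title, _, body = c.message.partition("\n")
        if c.parent != branchpoint:
            # Only the change sitting directly on the branchpoint is mergeable.
            title = "[DO NOT MERGE] " + title
        plan.append(Push(f"_{c.change_id}", tag, title, body.strip()))
    return plan
```

Keeping the decision logic separate from the git and GitHub calls means the naming and prefix rules can be checked without a repository.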


As changes earlier in a work branch are merged (remembering that later changes should not be merged because of the [DO NOT MERGE] prefix), later changes will need to be re-mailed
Continuing our example, if git badger mail detects that changes in mywork have been merged (which is now a trivial exercise because we have change IDs per commit), it will automatically try to rebase mywork onto origin/master, dropping those merged commits. If the rebase fails then the user will have to intervene, fix and continue the rebase, and re-run git badger mail. If the rebase succeeds (or none is required), then git badger mail will proceed as usual
git badger mail's automatic rebasing can be controlled by the -rebase flag, which takes the values auto (the default: rebase when needed and notify the user via stderr when this happens), on (always rebase) and off (do not rebase)
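
The rebase decision described above might look like the following sketch. It assumes that "needed" means at least one pending change ID already appears in origin/master, and that the set of merged change IDs has been computed elsewhere; the function name `rebase_plan` is hypothetical.

```python
import sys
from typing import List, Set, Tuple


def rebase_plan(pending_ids: List[str], merged_ids: Set[str], mode: str = "auto") -> Tuple[bool, List[str]]:
    """Return (should_rebase, change IDs to keep on the work branch, oldest first)."""
    if mode not in ("auto", "on", "off"):
        raise ValueError(f"unknown -rebase value: {mode}")
    if mode == "off":
        return False, list(pending_ids)  # leave the work branch untouched
    # Changes whose IDs already appear upstream are dropped from the rebase.
    keep = [cid for cid in pending_ids if cid not in merged_ids]
    needed = len(keep) != len(pending_ids)
    if mode == "auto" and needed:
        print("rebasing: some pending changes were merged upstream", file=sys.stderr)
    return (mode == "on" or needed), keep
```

For example, with pending changes I1, I2, I3 and I1 already merged, the default mode reports that a rebase is required and keeps I2 and I3.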

Additional Thoughts


When working with forks, one useful pattern is to name the upstream origin and your fork $USER. Then in those repos where you push to the fork, set git config remote.pushdefault $USER. This means git badger mail can be used without the -r flag
At the time of writing (Apr 2020) the GitHub Pull Request draft feature is only available to enterprise users. For enterprise users, git badger could theoretically use the Pull Request draft property instead of prefixing the Pull Request title with [DO NOT MERGE]
It would be trivial to create a simple GitHub action that fails if a Pull Request title starts with [DO NOT MERGE]. It might even make sense to add a git badger command that generates a workflow file (this does run the risk of being a bit GitHub specific however)
GitHub or any other collaboration tool will never be able to emulate the Gerrit server acting as the single source of truth for the mapping between change ID and CL (change list). Guards will be put in place within the git badger tool, but it is theoretically possible, in extremely rare, contrived edge cases, for multiple Pull Requests to exist for the same change ID at any given time, or for a change ID to be reused. The focus of this proposal is on easing individual developer workflows rather than fully emulating Gerrit, and it assumes good actors who don't try to trick the tool!
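
The title check suggested above for a GitHub action could be as small as the following sketch. It assumes the check runs on a pull_request event, where GitHub Actions exposes the webhook payload as a JSON file at the path in the GITHUB_EVENT_PATH environment variable; the function names are hypothetical.

```python
import json
import sys

PREFIX = "[DO NOT MERGE]"


def blocked(title: str) -> bool:
    """True if the Pull Request title carries the do-not-merge prefix."""
    return title.startswith(PREFIX)


def load_event(path: str) -> dict:
    """Read the webhook payload that GitHub Actions writes to disk."""
    with open(path) as f:
        return json.load(f)


def check_event(event: dict) -> int:
    """Return a process exit code: 1 fails the check, 0 passes it."""
    title = event["pull_request"]["title"]
    if blocked(title):
        print(f"failing check: title starts with {PREFIX}", file=sys.stderr)
        return 1
    return 0
```

A workflow step would then run something like `sys.exit(check_event(load_event(os.environ["GITHUB_EVENT_PATH"])))`, so the check goes red for every Pull Request except the earliest unmerged one in a chain.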

Thanks

Thanks to Daniel Martí (mvdan@mvdan.cc) for discussing and reviewing this proposal.
Version 1: 2020-04-20