@zerowidth
Last active July 18, 2017 17:50
A brief discussion on git workflow preferences (now published on zerowidth.com)

A colleague asks [slightly edited]:

Why didn't you squash your [feature branch merge] commit? I'm new to [project], but I found on [other project] that squashed commits made it much easier to git bisect. And since the [project] test suite is broken frequently, I assume we'll be git bisecting a lot. Un-squashed commits tended to leave broken tests or functionality that never actually shipped.

Git, because it's just "the stupid content tracker", is flexible enough to support just about any development workflow you can think of. You can use pull requests only (e.g. GitHub), use a complex release cycle (nvie's git-flow), or have a gatekeeper with deputies and a blessed repo (the Linux kernel). Because of this, your choice of workflow is a question of philosophy rather than anything the tool itself imposes.

Some argue that the role of a revision control system is to preserve history exactly as it was, without exception. Others, including many who commit code in our codebase, prefer to liberally rewrite and squash commits before pushing to the master branch.

I fall somewhere in between on this spectrum. When working with features, I stand by the "branch for everything" model. The question is, when and how should these branches be applied to master?

My goal when adding code to the master branch is to preserve both a clear history in the overall repository and the incremental development of distinct features.

For small features--one or two simple commits--the simplest thing to do is squash the commits and cherry-pick or fast-forward merge them back onto master. For larger features, especially ones that take more than a few minutes of effort and span many commits, I rebase and merge.
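For the small-feature case, here is a minimal sketch of one way to squash and fast-forward. It builds a throwaway repository so it can be run anywhere; the branch name small-feature, the file app.txt, and the commit messages are made up for illustration, and git reset --soft is just one of several ways to squash.

```shell
#!/bin/sh
# Sketch only: the repository, branch name, and file are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "before"

# A small feature: two simple commits on their own branch.
git checkout -q -b small-feature
echo "one" >> app.txt
git commit -q -am "step 1"
echo "two" >> app.txt
git commit -q -am "step 2"

# Squash: move the branch pointer back to master but keep the combined
# changes staged, then record them as a single commit.
git reset -q --soft master
git commit -q -m "Add small feature"

# Fast-forward master onto the squashed commit. --ff-only refuses to
# create a merge commit if master has moved in the meantime.
git checkout -q master
git merge -q --ff-only small-feature

git log --oneline master
```

If master has moved on since the branch was cut, the --ff-only merge will refuse to run; rebasing the squashed commit onto master first (or cherry-picking it) gets the same single-commit result.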

First, I rebase, because it helps keep the history generally linear. Along with rebasing, however, I make use of --interactive to tidy the feature branch a little, including folding fixup commits (e.g. "fixup: whoops, forgot a file") into the commits they belong to. I strive to have each commit be self-contained, including tests for that piece of the work. I'd like for each of my feature branches to have a clear flow of how I did the work, piece by piece.
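The fixup cleanup can be sketched roughly as follows. This is an assumption about tooling rather than a description of my exact keystrokes: it uses git commit --fixup and git rebase --autosquash so that the interactive todo list is arranged automatically, and it sets GIT_SEQUENCE_EDITOR=true only so the example runs without opening an editor. The branch and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: repository, branch, and file names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > README
git add README
git commit -q -m "before"

git checkout -q -b feature
echo "model" > model.txt
git add model.txt
git commit -q -m "step 1: add model"
echo "view" > view.txt
git add view.txt
git commit -q -m "step 2: add view"

# Whoops, forgot a file that belongs with step 1. Record it as a fixup
# of that commit (HEAD~1 at this point).
echo "test" > model_test.txt
git add model_test.txt
git commit -q --fixup=HEAD~1

# Interactive rebase onto master. --autosquash moves the fixup next to
# its target in the todo list; GIT_SEQUENCE_EDITOR=true accepts the
# list as generated instead of opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master

# Two commits remain, and step 1 now contains the forgotten file.
git log --oneline master..feature
git show --name-only --format=%s feature~1
```

In day-to-day use I would leave GIT_SEQUENCE_EDITOR alone and edit the todo list by hand, which is also the place to reword or reorder commits.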

Secondly, I merge with --no-ff. This is also what the "merge" button in GitHub does. I avoid fast-forward merges so that distinct features remain identifiable. For example:

before merge:           with fast-forward:    with --no-ff:

o - step 2 (feature)    o - step 2 (master)   o - merged feature into master
|                       |                     |\
o - step 1              o - step 1            | o - step 2
|                       |                     | |
o - before (master)     o - before            | o - step 1
|                       |                     |/
.                       .                     o - before
.                       .                     |
                                              .
                                              .

When a feature branch is merged using fast-forward, the feature's commits are flattened into master's history and their identity as a group is lost. By forcing a merge commit, each feature remains evident as a distinct unit of work, even if nothing else was committed to master in the meantime.
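To see the difference on a real graph, the following sketch reproduces the diagram in a throwaway repository. The ff-demo branch is a hypothetical second copy of master, added only so both merge styles can be compared side by side.

```shell
#!/bin/sh
# Sketch only: repository, branch, and file names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "before"

git checkout -q -b feature
echo "one" >> app.txt
git commit -q -am "step 1"
echo "two" >> app.txt
git commit -q -am "step 2"

git checkout -q master
git branch ff-demo              # second copy of master, for comparison

# Force a merge commit even though a fast-forward is possible.
git merge -q --no-ff -m "merged feature into master" feature

# Fast-forward the comparison branch instead.
git checkout -q ff-demo
git merge -q --ff-only feature

echo "with --no-ff:"
git log --graph --oneline master
echo "with fast-forward:"
git log --graph --oneline ff-demo

# Following only first parents shows master as a list of features.
echo "first-parent view of master:"
git log --first-parent --oneline master
```

The first-parent view is what makes the merge commit worthwhile: it reads as one line per feature, while the full graph still has the individual steps.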

I don't squash these feature branches because to do so loses something important: how the feature came about. The master branch should not only show what was developed but also how. This is especially important for refactoring and refurbishment work since the end result can be so drastically different. A merged branch preserves this information.

Additionally, squashing feature branches breaks attribution. When more than one person works on a feature, it's helpful to know that developer A worked on the CSS and developer B did the model and controller code. If it's all squashed together, there's no way to tell who was responsible for a particular change.

There are a few downsides to what I recommend:

  • Reverts are more complicated. More precisely, re-applying a reverted merge is complicated: git still considers the branch's commits to be merged, so merging the branch again brings nothing back until the revert itself is reverted.
  • You said git bisect is more difficult, but I'm not sure I buy this. If the commit before a merge is good, and the merge commit is bad, then it's the feature branch that broke it. Also, ideally, each commit in a branch is self-contained in that the tests all pass, so it's easier to track down what broke.
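The bisect argument can be tried out with a sketch like the one below. It assumes a throwaway repository in which the second commit of a feature branch introduces a problem, and it uses a grep for the string BUG as a stand-in for running a real test suite; all names are hypothetical.

```shell
#!/bin/sh
# Sketch only: the repository, file, and "test" command are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "ok" > app.txt
git add app.txt
git commit -q -m "before"

git checkout -q -b feature
echo "step 1" >> app.txt
git commit -q -am "step 1"
echo "BUG" >> app.txt           # this commit breaks things
git commit -q -am "step 2"

git checkout -q master
git merge -q --no-ff -m "merged feature into master" feature

# The merge commit is bad and its first parent is good, so the culprit
# is somewhere on the feature branch. Let bisect find it.
git bisect start HEAD HEAD^1 >/dev/null

# Stand-in test: exit 0 (good) when BUG is absent, non-zero (bad) otherwise.
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null

first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

git log -1 --format='first bad commit: %s' "$first_bad"
```

This only works cleanly when each commit on the branch can be tested on its own, which is the reason for keeping commits self-contained in the first place.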

When following a process closer to GitHub's, using pull requests and that ever-so-convenient big green "Merge pull request" button, it's good to keep in mind that they have other affordances helping the PR-only workflow. Most importantly, they have a CI system that runs the full suite on each feature branch. We don't have that convenience, so we have more work to do to verify things before merging (or squashing and cherry-picking, as the case may be).

As a footnote, I consider a note about git commit messages to be required reading, regardless of whatever else you do with your repositories.

Further reading:

@ck37
ck37 commented Jul 18, 2017

Here is the correct link for the "git-rebase" further reading: http://blog.izs.me/post/37650663670/git-rebase
