@peff
Last active August 29, 2015 14:16
explanation of merge versus rebase

Rebase conflicts can sometimes be harder than merge conflicts. This is because a merge looks only at the end points (what you have, what they have, and the merge base), whereas a rebase applies each commit of the rebased topic in turn. This can manifest itself in two ways:

  1. You'll get more conflicts when two lines of development reach similar endpoints but get there through differing paths. Consider this trivial example:

      $ echo base >file && git add file && git commit -m base
      $ git tag base
      $ echo master >file && git commit -a -m master
      $ echo shared >file && git commit -a -m shared
      $ git checkout -b other base
      $ echo other >file && git commit -a -m other
      $ echo shared >file && git commit -a -m shared
    

    When you merge the two, we will see these endpoints:

        ours: shared
      theirs: shared
        base: base
    

    so the resolution is obviously "shared". But if you rebase, you'll get conflicts, because we'll first try to apply the commit turning "base" into "other", while the other side has already turned "base" into "shared". And after resolving that, you can get another conflict (depending on how you resolved the first one), because the second commit expects to turn "other" into "shared".

    Obviously this example is completely trivial and stupid. But similar things can happen in real projects. For example, two branches refactor a function similarly. Or both branches cherry-pick or merge from a third branch, which pulls in the exact same changes.
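
    If you want to watch this happen, here is a sketch that replays the commands above in a throwaway repository and records whether each operation stops on a conflict. The temporary directory, the demo identity, and the `merge_result` / `rebase_result` variables are assumptions added for illustration; the commits themselves are the ones from the example.

    ```shell
    #!/bin/sh
    set -e

    # Work in a scratch repository so nothing real is touched.
    dir=$(mktemp -d) && cd "$dir"
    git init -q .
    git symbolic-ref HEAD refs/heads/master    # use the branch name from the text
    git config user.name demo                  # hypothetical identity for the demo
    git config user.email demo@example.com

    # Same history as in the example above.
    echo base >file && git add file && git commit -q -m base
    git tag base
    echo master >file && git commit -q -a -m master
    echo shared >file && git commit -q -a -m shared
    git checkout -q -b other base
    echo other >file && git commit -q -a -m other
    echo shared >file && git commit -q -a -m shared

    # Merge only compares the endpoints: both sides ended at "shared".
    if git merge --no-edit master >/dev/null 2>&1; then
      merge_result=clean
      git reset -q --hard ORIG_HEAD            # undo the merge before rebasing
    else
      merge_result=conflict
      git merge --abort
    fi

    # Rebase replays each commit, starting with base -> other.
    if git rebase master >/dev/null 2>&1; then
      rebase_result=clean
    else
      rebase_result=conflict
      git rebase --abort
    fi

    echo "merge:  $merge_result"
    echo "rebase: $rebase_result"
    ```

    The merge should come out clean, while the rebase should stop on the very first commit it tries to replay.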

    And the changes don't have to be exact. Let's imagine the "other" branch actually did this:

      $ echo other >file && git commit -a -m other
      $ echo shared >>file && git commit -a -m shared
    

    In other words, appending shared to the file instead of replacing it. In the merge case, you get a conflict, but the conflict looks like:

       <<<<<<< HEAD
       other
       =======
       >>>>>>> master
       shared
    

    which is fairly readable; you see that both sides ended up with "shared", but the "other" branch has some extra text in it. You can keep the text or not, but the similar parts are not in question. But for the rebase case, you can still end up dealing with two conflicts. The first one looks like:

      <<<<<<< HEAD
      shared
      =======
      other
      >>>>>>> other
    

    where the second one will depend on how you resolve the first.
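
    The merge side of this variant can be reproduced with a similar sketch. It assumes git's default conflict style (no `merge.conflictStyle` override); the scratch directory, demo identity, and the `merge_result` / `hunks` variables are made up for the demo.

    ```shell
    #!/bin/sh
    set -e

    dir=$(mktemp -d) && cd "$dir"
    git init -q .
    git symbolic-ref HEAD refs/heads/master    # use the branch name from the text
    git config user.name demo                  # hypothetical identity for the demo
    git config user.email demo@example.com

    echo base >file && git add file && git commit -q -m base
    git tag base
    echo master >file && git commit -q -a -m master
    echo shared >file && git commit -q -a -m shared
    git checkout -q -b other base
    echo other >file && git commit -q -a -m other
    echo shared >>file && git commit -q -a -m shared   # append instead of replace

    # The endpoints now differ ("other" + "shared" versus "shared"),
    # so the merge stops, but only once.
    merge_result=clean
    git merge --no-edit master >/dev/null 2>&1 || merge_result=conflict

    cat file                                   # show the conflicted file
    hunks=$(grep -c '^<<<<<<< ' file || true)  # number of conflict hunks

    if [ "$merge_result" = conflict ]; then
      git merge --abort
    fi
    echo "merge: $merge_result, hunks: $hunks"
    ```

    The printed file should contain a single conflict hunk, with the part both sides agree on left outside the markers.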

    Again, this is a trivial case. But it models what happens when two developers refactor a function. Let's say we both realize that "foo" needs a new parameter; you call it "a" and I call it "b". The merge result shows that the only difference is the name "a" versus "b". But during the rebase, you may actually see more complex conflicts if it took several commits to make the same refactoring.

  2. You may have noticed the second issue already cropping up in the first example. If you touch the same area of code over multiple commits (which is not uncommon), then conflicts in that area will keep coming back over and over, as the changes cascade through the newly rewritten commits. For example, let's say the merge-base has:

      int foo(int a) {
        return 5 * a;
      }
    

    On one side, we refactor foo into this:

      int foo(int a) {
        return 10 * a;
      }
    

    And on the other side, we add some more logic to foo, in a series of commits, like:

    • handle negative foo

      int foo(int a) {
        if (a < 0) a = 0;
        return 5 * a;
      }
      
    • handle large foo

      int foo(int a) {
        if (a < 0) a = 0;
        if (a > 10) a = 10;
        return 5 * a;
      }
      
    If we merge the results, we'll get a single conflict like:

       int foo(int a) {
       <<<<<<< HEAD
         if (a < 0) a = 0;
         if (a > 10) a = 10;
         return 5 * a;
       =======
         return 10 * a;
       >>>>>>> master
       }

    which isn't great, but you can see what's going on and what needs to be merged, and it's a single conflict. During a rebase, you'll get:

       int foo(int a) {
       <<<<<<< HEAD
         return 10 * a;
       =======
         if (a < 0) a = 0;
         return 5 * a;
       >>>>>>> handle negative foo
       }

    which is fine. It's no more or less complex than the merge case. But after you resolve it, then you get hit with the second commit's conflict:

       int foo(int a) {
         if (a < 0) a = 0;
       <<<<<<< HEAD
         return 10 * a;
       =======
         if (a > 10) a = 10;
         return 5 * a;
       >>>>>>> handle large foo
       }

    Which is basically the same conflict again. Doing it twice isn't so bad, but doing it over a 10-commit series is quite awful. And it gets even more complex if you had to significantly modify your lines to resolve prior commits during the rebase. Every subsequent commit will look like it's trying to revert the conflict resolutions you already put in, which can be quite confusing.
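
    To count the stops yourself, here is a sketch that builds the foo history in a scratch repository, tries the merge, then rebases and resolves each conflict by keeping both the guards and the new multiplier. The `write_foo` helper, the file name `foo.c`, the branch name `topic`, the `int` return type, and the counter variables are all assumptions made for the demo.

    ```shell
    #!/bin/sh
    set -e

    dir=$(mktemp -d) && cd "$dir"
    git init -q .
    git symbolic-ref HEAD refs/heads/master    # use the branch name from the text
    git config user.name demo                  # hypothetical identity for the demo
    git config user.email demo@example.com

    # Hypothetical helper: $1 is the multiplier, any further arguments are
    # guard lines placed before the return statement.
    write_foo() {
      mult=$1; shift
      {
        echo 'int foo(int a) {'
        for guard in "$@"; do echo "  $guard"; done
        echo "  return $mult * a;"
        echo '}'
      } >foo.c
    }

    write_foo 5 && git add foo.c && git commit -q -m base
    git tag base
    write_foo 10 && git commit -q -a -m 'multiply by 10'
    git checkout -q -b topic base
    write_foo 5 'if (a < 0) a = 0;' && git commit -q -a -m 'handle negative foo'
    write_foo 5 'if (a < 0) a = 0;' 'if (a > 10) a = 10;' &&
      git commit -q -a -m 'handle large foo'

    # Merge: one stop, no matter how many commits are on the topic.
    merge_stops=0
    if ! git merge --no-edit master >/dev/null 2>&1; then
      merge_stops=1
      git merge --abort
    fi

    # Rebase: resolve each stop by hand and count them.
    rebase_stops=0
    git rebase master >/dev/null 2>&1 || {
      rebase_stops=1
      write_foo 10 'if (a < 0) a = 0;'         # resolution for the first commit
      git add foo.c
      GIT_EDITOR=true git rebase --continue >/dev/null 2>&1 || {
        rebase_stops=2
        write_foo 10 'if (a < 0) a = 0;' 'if (a > 10) a = 10;'
        git add foo.c
        GIT_EDITOR=true git rebase --continue >/dev/null 2>&1
      }
    }

    echo "merge stops:  $merge_stops"
    echo "rebase stops: $rebase_stops"
    ```

    With a two-commit topic the rebase should stop twice where the merge stops once; a longer series touching the same lines would stop once per commit.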

    All of which should make sense. The point of rebase is to move your commits within history. So in effect, you have to rewrite each commit as you would have written it had you done it at the moment of the new location in history. Sometimes it is easier to just say "these things happened independently, and here is what the combined result looks like" (i.e., a merge).