peff/README.md

## README.md

      
Rebase conflicts can sometimes be harder than merge conflicts. This is
because a merge looks only at the end points (what you have, what they
have, and the merge base), while a rebase walks through each commit
of the rebased topic. This can manifest itself in two ways:


First, you'll get more conflicts when two stretches of code reach
similar endpoints but get there through differing paths. Consider
this trivial example:
  $ echo base >file && git add file && git commit -m base
  $ git tag base
  $ echo master >file && git commit -a -m master
  $ echo shared >file && git commit -a -m shared
  $ git checkout -b other base
  $ echo other >file && git commit -a -m other
  $ echo shared >file && git commit -a -m shared

When you merge the two, the merge sees only these endpoints:
    ours: shared
  theirs: shared
    base: base

so the resolution is obviously "shared". But if you rebase, you'll
get conflicts, because we'll first try to apply the commit turning
"base" into "other" on top of the other side, which turned "base" into
"shared". And then after resolving that, you get another conflict,
because the second commit expects to turn "other" into "shared".
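
The difference described above can be checked in a scratch repository. What follows is a minimal sketch, not a recipe: the temporary directory, the committer identity, and the `try-merge` branch are assumptions added for illustration, and the script reads the initial branch name from git rather than assuming it is "master".

```shell
#!/bin/sh
# Scratch repository; nothing outside this temporary directory is touched.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"

# Same history as the example above.
echo base >file && git add file && git commit -q -m base
git tag base
main=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on config
echo master >file && git commit -q -a -m master
echo shared >file && git commit -q -a -m shared
git checkout -q -b other base
echo other >file && git commit -q -a -m other
echo shared >file && git commit -q -a -m shared

# Merge on a throwaway branch (hypothetical name): both endpoints say "shared".
git checkout -q -b try-merge other
if git merge -q -m merged "$main" >/dev/null 2>&1; then
  merge_result=clean
else
  merge_result=conflict
fi

# Rebase the same two commits: the first replayed commit ("other") should stop.
git checkout -q other
if git rebase "$main" >/dev/null 2>&1; then
  rebase_result=clean
else
  rebase_result=conflict
fi
git rebase --abort 2>/dev/null

echo "merge: $merge_result, rebase: $rebase_result"
```

The merge runs on a separate branch so that "other" is left untouched for the rebase attempt, and the rebase is aborted at its first stop rather than resolved.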
Obviously this example is completely trivial and stupid. But
similar things can happen in real projects. For example, two
branches might refactor a function similarly, or both cherry-pick or
merge from a third branch, pulling in the exact same changes.
And the changes don't have to be exact. Let's imagine the "other"
branch actually did this:
  $ echo other >file && git commit -a -m other
  $ echo shared >>file && git commit -a -m shared

In other words, appending "shared" to the file instead of replacing
its contents. In the merge case, you get a conflict, but the conflict
looks like:
   <<<<<<< HEAD
   other
   =======
   >>>>>>> master
   shared

which is fairly readable; you see that they both ended up with
"shared", but the "other" branch has some extra text in it. You can
keep the text or not, but the similar parts are not in question.
But for the rebase case, you still end up dealing with two
conflicts. The first one looks like:
  <<<<<<< HEAD
  shared
  =======
  other
  >>>>>>> other

and what the second one looks like will depend on how you resolve the first.
Again, this is a trivial case. But it models what happens when two
developers refactor a function. Let's say we both realize that
"foo" needs a new parameter; you call it "a" and I call it "b". The
merge result shows that the only difference is the name "a" versus
"b". But during the rebase, you may actually see more complex
conflicts if it took several commits to make the same refactoring.


You may have noticed the second issue already cropping up in the
first example. If you touch the same area of code over multiple
commits (which is not uncommon), then conflicts in that area will
keep coming back over and over, as the changes cascade through the
newly rewritten commits. For example, let's say the merge-base has:
  int foo(int a) {
    return 5 * a;
  }

On one side, we refactor foo into this:
  int foo(int a) {
    return 10 * a;
  }

And on the other side, we add some more logic to foo, in a series
of commits, like:


"handle negative foo":

  int foo(int a) {
    if (a < 0) a = 0;
    return 5 * a;
  }

"handle large foo":

  int foo(int a) {
    if (a < 0) a = 0;
    if (a > 10) a = 10;
    return 5 * a;
  }


If we merge the results, we'll get a single conflict like:

       int foo(int a) {
       <<<<<<< HEAD
         if (a < 0) a = 0;
         if (a > 10) a = 10;
         return 5 * a;
       =======
         return 10 * a;
       >>>>>>> master
       }

which isn't great, but you can see what's going on and what needs
to be merged, and it's a single conflict. During a rebase, you'll
get:

       int foo(int a) {
       <<<<<<< HEAD
         return 10 * a;
       =======
         if (a < 0) a = 0;
         return 5 * a;
       >>>>>>> handle negative foo
       }

 which is fine. It's no more or less complex than the merge case.
 But after you resolve it, then you get hit with the second commit's
 conflict:

       int foo(int a) {
         if (a < 0) a = 0;
         <<<<<<< HEAD
           return 10 * a;
         =======
           if (a > 10) a = 10;
           return 5 * a;
         >>>>>>> handle large foo
       }

 which is basically the same conflict again. Doing it twice isn't so
 bad, but doing it over a 10-commit series is quite awful. And it
 gets even more complex if you had to significantly modify your
 lines to resolve prior commits during the rebase. Every subsequent
 commit will look like it's trying to revert the conflict
 resolutions you already put in, which can be quite confusing.
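
 The repeated stops can be counted in a scratch repository. This is a rough sketch under several assumptions that are not part of the example above: the file name `foo.c`, the branch names `topic` and `try-merge`, the commit message "multiply by 10", and the resolution rule (keep the replayed commit's checks, take the 10x multiplier) are all made up for illustration, and foo is declared as returning `int`.

```shell
#!/bin/sh
# Scratch repository; file, branch, and commit names here are illustrative.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
git config user.name "Example" && git config user.email "example@example.com"

# Write foo.c with the given body lines between the braces.
write_foo() {
  { echo 'int foo(int a) {'; printf '  %s\n' "$@"; echo '}'; } >foo.c
}

write_foo 'return 5 * a;'
git add foo.c && git commit -q -m base
main=$(git symbolic-ref --short HEAD)

# One side: two commits that each add logic in front of the return.
git checkout -q -b topic
write_foo 'if (a < 0) a = 0;' 'return 5 * a;'
git commit -q -a -m 'handle negative foo'
write_foo 'if (a < 0) a = 0;' 'if (a > 10) a = 10;' 'return 5 * a;'
git commit -q -a -m 'handle large foo'

# Other side: change the multiplier.
git checkout -q "$main"
write_foo 'return 10 * a;'
git commit -q -a -m 'multiply by 10'

# Merge on a throwaway branch and count the conflict hunks in foo.c.
git checkout -q -b try-merge topic
git merge "$main" >/dev/null 2>&1
merge_hunks=$(grep -c '^<<<<<<<' foo.c)
git merge --abort

# Rebase the topic, resolving each stop the same way, and count the stops.
git checkout -q topic
stops=0
git rebase "$main" >/dev/null 2>&1
while [ -d .git/rebase-merge ] || [ -d .git/rebase-apply ]; do
  stops=$((stops + 1))
  [ "$stops" -le 10 ] || break           # safety net against looping forever
  git checkout -q --theirs -- foo.c      # start from the commit being replayed
  sed 's/5 \* a/10 \* a/' foo.c >foo.tmp && mv foo.tmp foo.c
  git add foo.c
  GIT_EDITOR=true git rebase --continue >/dev/null 2>&1
done

echo "merge conflict hunks: $merge_hunks, rebase stops: $stops"
```

 With the two-commit series from the walk-through, this should report a single merge hunk and two rebase stops; a longer series touching the same lines would add one stop per commit.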

 All of which should make sense. The point of rebase is to move your
 commits within history. So in effect, you have to rewrite each
 commit as you would have written it had you started from the new
 location in history. Sometimes it is easier to just say
 "these things happened independently, and here is what the combined
 result looks like" (i.e., a merge).