HEY: I've turned this into a blog post, which is a little more in depth.
🚨 https://eev.ee/blog/2016/06/04/converting-a-git-repo-from-tabs-to-spaces/ 🚨
-
Fix any inconsistent indentation in your existing files, or Python code will break, since Python treats a tab as 8 columns wide and we're about to make it 4.
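One way to find the trouble spots before converting is a small scan like the sketch below. It is only an illustration, not part of the original recipe: the function names are made up here, and it only flags lines whose leading whitespace mixes tabs and spaces, plus files that use both styles on different lines. Python's standard tabnanny module (python -m tabnanny) is another option.

```python
import re

# Leading run of spaces and tabs at the start of a line.
LEADING_WHITESPACE = re.compile(r"[ \t]*")


def mixed_indent_lines(text):
    """Return 1-based numbers of lines whose indentation mixes tabs and spaces."""
    flagged = []
    for number, line in enumerate(text.split("\n"), start=1):
        indent = LEADING_WHITESPACE.match(line).group()
        if " " in indent and "\t" in indent:
            flagged.append(number)
    return flagged


def uses_both_styles(text):
    """Return True if some lines start with a tab and others start with a space."""
    starts = {line[0] for line in text.split("\n") if line[:1] in (" ", "\t")}
    return starts == {" ", "\t"}
```

Running both functions over the contents of each .py file gives a list of lines and files to fix by hand first.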
-
Populate .gitattributes in your repository, as below.
    *.py filter=spabs
You may want more filetypes; just add more lines with different extensions.
Optionally, commit it. DO NOT PUSH YET.
-
Run expand manually on your entire repository. (TODO how to do this, and/or how to make git do it.) Commit. DO NOT PUSH YET.
-
By hand of God, big scary emails, or perhaps by editing /etc/gitconfig on all your developers' machines, give the chunk of .gitconfig below to all of your contributors.
    [filter "spabs"]
        clean = expand --initial -t 4
        smudge = expand --initial -t 4
        required
    [merge]
        renormalize = true
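To make the filter concrete, here is a rough Python equivalent of what expand --initial -t 4 does to each line: only tabs in the leading whitespace are expanded, each to the next multiple of 4 columns. This is a sketch under assumptions, not the author's method. All function names are invented for illustration, files are assumed to be UTF-8, and detab_tracked_files is just one possible way to do the one-off conversion pass from the earlier step (it relies on git ls-files to list tracked files).

```python
import os
import subprocess


def expand_initial(line, tabsize=4):
    """Expand tabs in the leading whitespace of one line, like `expand --initial`."""
    pieces = []
    column = 0
    for index, char in enumerate(line):
        if char == "\t":
            # A tab advances to the next multiple of tabsize.
            width = tabsize - (column % tabsize)
            pieces.append(" " * width)
            column += width
        elif char == " ":
            pieces.append(" ")
            column += 1
        else:
            # First non-blank character: leave the rest of the line untouched.
            return "".join(pieces) + line[index:]
    return "".join(pieces)


def expand_text(text, tabsize=4):
    """Apply expand_initial to every line, keeping line endings as they are."""
    return "\n".join(expand_initial(line, tabsize) for line in text.split("\n"))


def detab_file(path, tabsize=4):
    """Rewrite one file in place (assumes UTF-8); return True if it changed."""
    with open(path, encoding="utf-8", newline="") as handle:
        original = handle.read()
    converted = expand_text(original, tabsize)
    if converted != original:
        with open(path, "w", encoding="utf-8", newline="") as handle:
            handle.write(converted)
    return converted != original


def detab_tracked_files(pattern="*.py"):
    """Detab every tracked file matching pattern; run from the repository root."""
    listing = subprocess.run(
        ["git", "ls-files", "-z", "--", pattern],
        capture_output=True, check=True,
    ).stdout
    paths = [os.fsdecode(name) for name in listing.split(b"\0") if name]
    return [path for path in paths if detab_file(path)]
```

Calling detab_tracked_files() from the top of a checkout returns the list of files it rewrote, which is worth reviewing with git diff -w (it should show nothing) before committing.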
-
Now you push.
Note that this will not keep tabs in the repository and spaces in a checkout or whatever other nonsense. This will convert tabs to spaces, permanently, period, everywhere.
-
Anyone checking the repository out will just get spaces, because that's what git's storing now. The filter will run all the time and replace any new tabs before they can be committed.
-
Anyone with an inflight branch will see tabs on that branch, because the .gitattributes file won't exist yet.
-
Anyone who merges an inflight branch with master will have their branch transparently renormalized before git tries to merge, thanks to merge.renormalize. After the merge, the branch will have spaces. Most likely the developer will never notice anything changed at all. (This also applies in the other direction: if other work happens on master while you're detabbing in a branch, you can merge master in seamlessly. Either way, .gitattributes ends up in the merged result, and that's what Git uses.)
-
Anyone who rebases an inflight branch is totally fucked, because merge.renormalize doesn't apply to rebasing. So you must send out another BIG SCARY EMAIL informing all your rebasing jerks that they must pass -Xrenormalize anytime they rebase a tabbed branch. This will more explicitly do the same thing that happens for merging. (It works for merging, too, but since there's a config flag there's not much reason to use it there. Also the same applies to cherry-pick and other ways of rearranging commits.)
-
New files on inflight branches WILL NOT be de-tabbed during the merge: they were only changed on one side, so git sees no reason to merge them! But git will still consider their "canon" representations to be spaces, so git diff will claim that every single indented line has "changed" from tabs to spaces, even if the file on disk still contains tabs. git checkout or git reset --hard will not make the "changed" files go away.
It's possible to fix this with a clever git hook that applies the filter to new files during a merge, but it's not that huge a problem in practice: git status will report the files as modified immediately following the merge and they can be committed then. If more work is done before someone notices, git diff -w will still confirm the "useful" part of the change.
-
Stashes will not apply cleanly, and git stash apply seems to ignore -X. There are two workarounds:
-
Convert the stash to a branch with git stash branch, then merge or rebase it in.
-
Apply the stash manually with e.g. git cherry-pick 'stash@{0}' -n -m 1 -Xrenormalize. You need the -m 1 because a stash is actually a merge of several distinct commits that hold different parts of the stash, and cherry-pick wants to know which parent to diff against. -n just prevents committing, so you don't end up with "WIP: ..." as a commit message.
-
Of course, anyone without the filter definition somewhere in git's configuration will be utterly confused. So this probably only works for fairly centralized development or very small teams.
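A quick check that a machine actually has the filter could look like the sketch below. It is an illustration only: the function names are hypothetical, it reads values with git config --get, and it compares plain strings, so it would wrongly flag an equivalent spelling such as renormalize = yes. The bare required line is deliberately left out of the comparison.

```python
import subprocess

# Values from the .gitconfig chunk earlier in this document.
# Compared as plain strings, so "yes" or "1" in place of "true" is reported as missing.
EXPECTED = {
    "filter.spabs.clean": "expand --initial -t 4",
    "filter.spabs.smudge": "expand --initial -t 4",
    "merge.renormalize": "true",
}


def git_config_get(key):
    """Return the configured value for key, or None if it is unset."""
    result = subprocess.run(
        ["git", "config", "--get", key],
        capture_output=True, text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


def missing_settings(lookup=git_config_get):
    """List the expected keys whose value is absent or different."""
    return [key for key, value in EXPECTED.items() if lookup(key) != value]
```

Calling missing_settings() with no arguments queries the real configuration; an empty list means all three values match. The lookup parameter exists only so the check can be exercised without touching git.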
-
.gitattributes is cool but is not a magic bullet. Whenever you ask git to look at a file, it will always report seeing spaces, but if you put tabs in a file on disk, they'll stay there until you ask git to update the file (via merge, etc.). Confusion will abound, especially in Python files. You can force a checkout with git checkout-index --force [files...].
-
Eventually you should let your developers know that they can drop whatever .vimrc et al. hacks they've been using to force tabs within your codebase.
-
This may balloon your reflog, but git stores binary patches, the changes are all the same character and so very amenable to gzip anyway, and git gc will eventually take care of it.
-
If you feel particularly destructive, you can also put the attribute stuff in /etc/gitattributes and have it apply to files in all git repositories on the entire machine.
-
Blame is not, in fact, totally wrecked. Use git blame -w to ignore whitespace-only changes.
-
ONLY DO THIS IF YOU ARE ABSOLUTELY SURE YOU WILL NEVER CHANGE YOUR MIND.
This is an awesome mini-guide. I've never come across required or renormalize, and it was fascinating to read about them. I have to say they are both rather elusive topics. I've never even seen those properties mentioned in the Pro Git book! I wish there was a Ninja emoji, because you, sir, earned the title Ninja.