@richfitz
Last active September 16, 2020 10:28
Unify (and unixify) line endings through a git repository's history

The basic strategy is the same as here, but I had some trouble getting find to behave as it does in the article, and had to work around file names that contain spaces or other special characters (especially apostrophes, which occur in people's names).

The fix-eol.sh file is basically the same as the article's, except that it uses a while loop instead of xargs so that each file name is passed on intact, with no extra escaping needed.

The fix-eol-1.sh file converts all line endings in a single file to Unix. The problem files are mostly old Mac (\r), but the script does Windows -> Unix first so that the \r in Windows files isn't also changed to a \n (giving \n\n).

Run it with:

git filter-branch --tree-filter '~/Documents/Projects/baad/fix-eol.sh' --prune-empty -- --all

Note that this is slow. Running it took around an hour, because at each step it fully reads and rewrites every CSV file in the project, and it does so for every commit.
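Since the rewrite takes this long, it may be worth confirming the result afterwards rather than re-running. A minimal sketch, assuming the CSV files live under data as in fix-eol.sh; `list_cr_files` is a hypothetical helper and not part of the original scripts.

```shell
#!/bin/sh
# Hypothetical check, not part of the gist: list the CSV files under a
# directory that still contain a carriage return.
list_cr_files() {
    dir=$1
    # Command substitution strips trailing newlines only, so the CR survives.
    cr=$(printf '\r')
    # grep -l prints just the names of files with at least one match.
    find "$dir" -name '*.csv' -exec grep -l "$cr" {} +
}

# Usage, from the repository root after the rewrite:
#   list_cr_files data
# No output means no CSV file under data still contains a CR.
```

This only inspects the checked-out tree, so older commits would have to be checked out separately to be verified.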

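#!/bin/sh
# fix-eol-1.sh: convert the line endings of one file to Unix (\n), in place.
# Usage: fix-eol-1.sh FILE
# Convert Windows -> Unix
perl -pi -e 's/\r\n/\n/g' "$1"
# Convert *old* Mac -> Unix (thanks Excel). This must run after the Windows
# step, otherwise each \r\n would become \n\n.
perl -pi -e 's/\r/\n/g' "$1"
# Some files had double newlines (\n\n), probably due to previous line ending
# fixes. Note that this deletes every empty line in the file.
perl -pi -e 's/^\n//' "$1"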
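#!/bin/bash
# fix-eol.sh: run fix-eol-1.sh on every CSV file under data/.
# bash is required: read -d is not available in POSIX sh.
# -print0 and read -d '' split names on NUL, so spaces and apostrophes are safe.
find data -name '*.csv' -print0 | while IFS= read -r -d '' file
do
    /path/to/fix-eol-1.sh "$file"
done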
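The read -d loop relies on a bash feature. Where only a POSIX sh is available, one possible alternative is to let find run the per-file script itself. This is a sketch of a swapped-in technique, not the gist's method; `fix_all` and its two parameters are hypothetical names.

```shell
#!/bin/sh
# Sketch of a POSIX sh alternative to the read -d loop (an assumption, not
# what the gist does). fix_all is a hypothetical helper:
#   $1 = directory to search
#   $2 = path to the per-file script (fix-eol-1.sh in the gist)
fix_all() {
    dir=$1
    fix_one=$2
    # find hands each file name to the script as a single argument, so
    # spaces and apostrophes in names need no escaping.
    find "$dir" -name '*.csv' -exec sh "$fix_one" {} \;
}

# Usage:
#   fix_all data /path/to/fix-eol-1.sh
```

Like the loop, this starts one process per file, so it is unlikely to make the rewrite any faster.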