Skip to content

Instantly share code, notes, and snippets.

@Pseudomanifold
Last active April 27, 2023 14:58
Show Gist options
  • Star 4 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save Pseudomanifold/d7b996492dc3020f2acea87b49cc54c3 to your computer and use it in GitHub Desktop.
Save Pseudomanifold/d7b996492dc3020f2acea87b49cc54c3 to your computer and use it in GitHub Desktop.
Counting words in TeX files under version control in order to depict the temporal evolution of the document. Useful for creating graphs such as the one shown in PhD comics #1915, http://phdcomics.com/comics/archive.php?comicid=1915.
#!/bin/bash
#
# This script traverses all commits of a git repository in the current directory
# and counts the number of words that are changed, i.e. added or deleted for all
# TeX files. The output contains a time-stamp (YYYY-MM-DD). Next, there are some
# counts, viz. the number of added, deleted, and total words.
#
# You can use `gnuplot`, for example, to create nice visualizations from the raw
# data. For this, it may be useful to specify the first column as a date column:
#
# set xdata time
# set timefmt "%Y-%m-%d"
#
# Following this, you may plot the desired information. Use 'branch' in order to
# control for which branch the commits are being enumerated.
branch="master"
words=0
for commit in $(git rev-list --reverse $branch)
do
date=$(git show -s --format=%cd --date=short $commit)
added=$(git show -p --word-diff=porcelain $commit "*.tex" | grep -e '^+[^+]' | wc -w)
deleted=$(git show -p --word-diff=porcelain $commit "*.tex" | grep -e '^-[^-]' | wc -w)
words=$(($words+$added))
words=$(($words-$deleted))
echo $date $added $deleted $words
done
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment