$ git [git options] user-stats [git-log options]
$ git user-stats
Email                 Commits  Files  Insertions  Deletions  Total Lines
-----                 -------  -----  ----------  ---------  -----------
john.smith@gmail.com  289      35     5361        3293       8654
joe.dirt@yahoo.com    142      17     2631        1756       4387
jack.bauer@fbi.gov    115      9      1407        1107       2514
$ git -C path/to/repo user-stats --since="1 week ago"
Email                 Commits  Files  Insertions  Deletions  Total Lines
-----                 -------  -----  ----------  ---------  -----------
joe.dirt@yahoo.com    20       3      83          634        717
john.smith@gmail.com  21       2      242         110        352
Download the script, give it executable permissions, and stick it somewhere on your PATH, e.g.:
wget -O ~/bin/git-user-stats https://gist.githubusercontent.com/shitchell/783cc8a892ed1591eca2afeb65e8720a/raw/git-user-stats
chmod +x ~/bin/git-user-stats
cd ~/path/to/repo
git user-stats --since="1 week ago"
Basically, it uses git log --format="author: %ae" --numstat (minus any empty lines or binary files) to generate output that looks like:
author: bob.smith@gmail.com
1 147 foo/bar.py
0 370 hello/world.py
author: john.smith@aol.com
7 6 foo/bar.py
author: jack.bauer@fbi.gov
1 0 super/sekrit.txt
author: john.smith@aol.com
2 1 hello/world.py
Each section that starts with author: ... is a single commit. The first column of --numstat is the number of insertions, and the second column is the number of deletions for that file.
It then walks over each line with awk. Whenever it hits a line that starts with author:, it stores the 2nd column of that line (the email address of the author for that particular commit) in the variable author and increments that user's total number of commits. For each subsequent line, it updates the number of insertions, deletions, and files for that user until it hits the next line that starts with author:. Rinse and repeat until it's done.
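The walk described above can be sketched roughly as follows. This is not the script itself, just a minimal approximation: the function name aggregate_stats, the tab-separated output, and the rule that a file counts once per author are all assumptions made for illustration, and it sorts with sort(1) rather than inside awk so that it runs on any POSIX awk.

```shell
#!/bin/sh
# aggregate_stats (hypothetical name): reads lines of the form
#   author: <email>
#   <insertions> <deletions> <path>
# on stdin and prints one tab-separated row per author:
#   email  commits  files  insertions  deletions  total
# Example feed (same filters the text describes):
#   git log --format="author: %ae" --numstat | grep -v '^$' | grep -v '^-' | aggregate_stats
aggregate_stats() {
    awk '
        $1 == "author:" {
            author = $2            # email of the commit we are now inside
            commits[author]++
            next
        }
        {
            insertions[author] += $1
            deletions[author]  += $2
            # Assumption: each path is counted once per author.
            key = author SUBSEP $3
            if (!(key in seen)) {
                seen[key] = 1
                files[author]++
            }
        }
        END {
            for (a in commits)
                printf "%s\t%d\t%d\t%d\t%d\t%d\n", a, commits[a], files[a],
                       insertions[a], deletions[a], insertions[a] + deletions[a]
        }
    ' | sort -t "$(printf '\t')" -k6,6nr   # largest total (column 6) first
}
```

With the default field separator, a path containing spaces is only keyed on its first word, which is good enough for a sketch but worth knowing about.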
At the end, it sorts by the total line changes (insertions + deletions) and prints out all of the collected stats. If you wanted to sort by something else, you would simply replace the total array with the relevant array in the asorti(...) function. For example, to sort by number of files, you would change that line to:
n = asorti(files, sorted_emails, "@val_num_desc");
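If you'd rather not edit the awk program, a similar reordering can be approximated on the printed table with sort(1). This is a swapped-in alternative, not what the script does: a minimal sketch assuming the output always starts with exactly two header lines (as in the examples above) and that no data field contains spaces. sort_stats_by_column is a hypothetical helper name.

```shell
#!/bin/sh
# sort_stats_by_column N (hypothetical helper): re-sorts a stats table read
# from stdin by its Nth column, numerically, largest first.
# Columns as printed in the examples:
#   1 Email, 2 Commits, 3 Files, 4 Insertions, 5 Deletions, 6 Total Lines
sort_stats_by_column() {
    col=$1
    {
        # Assumption: the first two lines are the header and its underline.
        IFS= read -r header
        IFS= read -r rule
        printf '%s\n%s\n' "$header" "$rule"
        # Everything left on stdin is data rows.
        sort -k "${col},${col}nr"
    }
}
```

For instance, git user-stats | sort_stats_by_column 3 would list the users who touched the most files first.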
Note that the function allows for passing custom git log arguments :D
The git log output is run through:
* tr '[A-Z]' '[a-z]' to normalize email addresses. My company capitalizes email addresses a la John.Smith@TheCompany.com, and depending on where / how a user is making their commit, that email might show up capitalized or all lowercase. This ensures that all instances of a particular email address are always grouped together regardless of capitalization.
* grep -v '^$' to remove empty lines that show up by default in the log output
* grep -v '^-' to remove the --numstat info for binary files, which looks like:
    - - foo/bar.png
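The three filters can be tried in isolation on a few lines of fake log output. A small sketch: normalize_log is a hypothetical wrapper name, and the sample input in the comment is made up for illustration.

```shell
#!/bin/sh
# normalize_log (hypothetical name): applies the three filters described
# above to raw "git log --format=... --numstat" text read from stdin.
normalize_log() {
    # 1. lowercase everything (this affects file paths too, not just emails)
    # 2. drop empty lines
    # 3. drop binary-file rows, which start with "-"
    tr '[A-Z]' '[a-z]' | grep -v '^$' | grep -v '^-'
}

# Example:
#   printf 'author: John.Smith@TheCompany.com\n\n3\t1\tfoo/bar.py\n-\t-\tfoo/bar.png\n' | normalize_log
```

Because tr runs over the whole stream, paths that differ only by case end up looking identical downstream, which is usually harmless for per-user totals.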
Just for the record, this was originally posted at https://stackoverflow.com/a/73781404/7562633
It was so legendary that I answered 2 questions to increase my reputation, ONLY to upvote this!