Created
January 5, 2013 11:08
Save amitchhajer/4461043 to your computer and use it in GitHub Desktop.
Count number of code lines in git repository per user
git ls-files -z | xargs -0n1 git blame -w | perl -n -e '/^.*\((.*?)\s*[\d]{4}/; print $1,"\n"' | sort -f | uniq -c | sort -n
Hello,
Is there a way to count this per month, for example how many lines of code each user contributed to the repository from January to March?
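One possible approach to the question above, as a sketch rather than a definitive answer: `git blame` only reports who last touched each surviving line, so for a date window it may be simpler to sum the lines each author added, using `git log --numstat`. The function name `added_lines_by_author` and the `@` author marker are illustrative choices, and the result counts lines added in the period, not lines still present today.

```bash
#!/usr/bin/env bash
# Sum added lines per author from `git log --numstat` output read on stdin.
# Assumes author lines are prefixed with "@" (produced by --format='@%aN').
added_lines_by_author() {
  awk -F'\t' '
    # remember the author of the current commit
    /^@/ { author = substr($0, 2); next }
    # numstat rows are "added<TAB>deleted<TAB>path"; binary files show "-"
    NF == 3 && $1 ~ /^[0-9]+$/ { added[author] += $1 }
    END { for (a in added) printf "%7d %s\n", added[a], a }
  ' | sort -rn
}

# Usage inside a repository (the dates are examples):
# git log --since='2024-01-01 00:00' --until='2024-04-01 00:00' \
#     --numstat --format='@%aN' | added_lines_by_author
```

Running the same pipeline once per month, with different `--since` and `--until` values, would give a month-by-month breakdown.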
The other commands here took hours for our project. Here is a faster method:
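- Remember to use blame.ignoreRevsFile to ignore mass edits (such as code style fixes).
- Use `git ls-files -x "*pdf" -x "*xml"` to filter out files. Note that -x only applies to untracked files; for tracked files, an exclude pathspec such as `git ls-files -- ':!*.pdf' ':!*.xml'` filters the list.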
git ls-files | while IFS= read -r i; do git blame -- "$i" | sed -e 's/^[^(]*(//' -e 's/^\([^[:digit:]]*\)[[:space:]]\+[[:digit:]].*/\1/' -e 's/[[:blank:]]*$//'; done | sort -f | uniq -ic | sort -rn
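For the blame.ignoreRevsFile tip, a minimal sketch of the setup might look like this. It assumes Git 2.23 or newer; the helper name `ignore_rev_in_blame` and the file name `.git-blame-ignore-revs` are illustrative choices (the file name is a common convention, not something Git requires).

```bash
#!/usr/bin/env bash
# Record a mass-edit commit so that later `git blame` runs skip it.
# Hypothetical helper: the name and the ignore-file name are assumptions.
ignore_rev_in_blame() {
  local rev top
  # resolve the argument to a full commit hash, fail if it is not a commit
  rev=$(git rev-parse --verify --quiet "$1^{commit}") || return 1
  top=$(git rev-parse --show-toplevel) || return 1
  # the ignore file holds one full commit hash per line
  printf '%s\n' "$rev" >> "$top/.git-blame-ignore-revs"
  # tell blame to read that file on every run in this repository
  git config blame.ignoreRevsFile .git-blame-ignore-revs
}

# Usage (replace <commit> with the hash of the style-fix commit):
# ignore_rev_in_blame <commit>
```

Once configured, the blame pipelines in this thread attribute lines changed only by the listed commits to the previous author of those lines.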
Counting only activity from the last two years:
git ls-files | while IFS= read -r i; do git blame --since=2.years -- "$i" | grep -v '^\^' | sed -e 's/^[^(]*(//' -e 's/^\([^[:digit:]]*\)[[:space:]]\+[[:digit:]].*/\1/' -e 's/[[:blank:]]*$//'; done | sort -f | uniq -ic | sort -rn
Solution modified from: https://stackoverflow.com/a/2788077
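As a hedged alternative to the sed expressions above, `git blame --line-porcelain` prints an `author` header for every line of the file, which avoids parsing the human-readable column layout. The function name `count_porcelain_authors` is made up for this sketch.

```bash
#!/usr/bin/env bash
# Count lines per author from `git blame --line-porcelain` output on stdin.
# Header lines look like "author Jane Doe"; file content lines start with
# a TAB, so they can never be mistaken for a header.
count_porcelain_authors() {
  sed -n 's/^author //p' | sort -f | uniq -ic | sort -rn
}

# Usage inside a repository; -w ignores whitespace-only changes:
# git ls-files -z | xargs -0 -n1 git blame -w --line-porcelain -- | count_porcelain_authors
```

The porcelain format is more verbose, so this is not necessarily faster; the point is that author names containing digits or parentheses no longer confuse the parsing.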
here's my one-liner:
function gitfilecontributors() { local perfile="false" ; if [[ $1 = "-f" ]]; then perfile="true" ; shift ; fi ; if [[ $# -eq 0 ]]; then echo "no files given!" >&2 ; return 1 ; else local f ; { for f in "$@"; do echo "$f" ; git blame --show-email "$f" | sed -nE 's/^[^ ]* *.<([^>]*)>.*$/: \1/p' | sort | uniq -c | sort -r -nk1 ; done } | if [[ "$perfile" = "true" ]]; then tee /tmp/gitblamestats.txt ; else tee /tmp/gitblamestats.txt >/dev/null ; fi ; echo ; echo "total:" ; awk -v FS=' *: *' '/^ *[0-9]/{sums[$2] += $1} END { for (i in sums) printf("%7s : %s\n", sums[i], i)}' /tmp/gitblamestats.txt | sort -r -nk1 ; fi ; }
or with line breaks:
gitfilecontributors ()
{
    local perfile="false";
    if [[ $1 = "-f" ]]; then
        perfile="true";
        shift;
    fi;
    if [[ $# -eq 0 ]]; then
        echo "no files given!" 1>&2;
        return 1;
    else
        local f;
        {
            for f in "$@";
            do
                echo "$f";
                git blame --show-email "$f" | sed -nE 's/^[^ ]* *.<([^>]*)>.*$/: \1/p' | sort | uniq -c | sort -r -nk1;
            done
        } | if [[ "$perfile" = "true" ]]; then
            tee /tmp/gitblamestats.txt;
        else
            tee /tmp/gitblamestats.txt > /dev/null;
        fi;
        echo;
        echo "total:";
        awk -v FS=' *: *' '/^ *[0-9]/{sums[$2] += $1} END { for (i in sums) printf("%7s : %s\n", sums[i], i)}' /tmp/gitblamestats.txt | sort -r -nk1;
    fi
}
Usage: pass the files of the folder(s) of your choice.
Use the -f option to show counts per file; otherwise only the totals are printed:
$ gitfilecontributors $(fd --type f '.*' source)
total:
139 : somebody@somewhere.de
29 : else.user@somewhere.de
9 : just.another@somewhere.de
$ gitfilecontributors -f $(fd --type f '.*' source)
source/040_InitialSetup.md
80 : somebody@somewhere.de
29 : else.user@somewhere.de
6 : just.another@somewhere.de
README.md
59 : somebody@somewhere.de
5 : whosthat@somewhere.de
3 : just.another@somewhere.de
total:
139 : somebody@somewhere.de
29 : else.user@somewhere.de
9 : just.another@somewhere.de
5 : whosthat@somewhere.de
this is exactly what i was looking for :)
Here's a variation on the earlier responses that parallelizes the blame. This can result in a significant speedup if you have multiple cores. This version also supports filenames that may be quoted by 'git ls-files' (tabs, newlines, backslashes, quotes, UTF-8, etc.) or that begin with a "-":
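The sketch below is one way to parallelize along these lines; it is an illustration under stated assumptions, not the commenter's script. It assumes an xargs that supports `-P` (GNU and BSD versions do, but it is not part of POSIX). The function names `sum_author_counts` and `blame_parallel` and the default of 4 jobs are made up for this example. Each worker counts authors for one file and prints only a short summary, which keeps the output of concurrent workers from interleaving in most cases.

```bash
#!/usr/bin/env bash
# Merge "count name" lines (as printed by `uniq -c`) into per-author totals.
sum_author_counts() {
  awk '
    { n = $1; sub(/^ *[0-9]+ /, ""); total[$0] += n }
    END { for (a in total) printf "%7d %s\n", total[a], a }
  ' | sort -rn
}

# Blame every tracked file with up to $1 parallel jobs (default: 4).
# -z and -0 pass file names NUL-separated, so git never quotes them;
# "--" keeps names that start with "-" from being read as options.
blame_parallel() {
  local jobs=${1:-4}
  git ls-files -z |
    xargs -0 -n1 -P "$jobs" sh -c '
      git blame -w --line-porcelain -- "$1" |
        sed -n "s/^author //p" | sort | uniq -c
    ' _ |
    sum_author_counts
}

# Usage inside a repository, with 8 parallel jobs:
# blame_parallel 8
```

Keeping the aggregation in a separate function also makes it possible to check the totals logic without a repository at hand.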