Bash script to generate churn counts in git repo
churn number and file name
git log --all -M -C --name-only | grep -E '^(app|lib)/' | sort | uniq -c | sort | awk 'BEGIN {print "count,file"} {print $1 "," $2}'
churn number and file name, limited to the last n commits
git log --all -n 5000 -M -C --name-only | grep -E '^spec/models' | sort | uniq -c | sort | awk 'BEGIN {print "count,file"} {print $1 "," $2}'
graph of churn number and frequency
git log --all -M -C --name-only | grep -E '^(app|lib)/' | sort | uniq -c | sort | awk '{print $1}' | uniq -c | sort | awk 'BEGIN { print "frequency,churn_count"} { print $1 "," $2}'
Owner

coreyhaines commented Feb 16, 2011

I pipe this into a file, then I can load it into a spreadsheet.

Help me make it better!

leshill commented Feb 16, 2011

No need for the first uniq: sort, count, then sort by count.

I think your first uniq will lose some data, since uniq will blow away consecutive edits (and only those). Is that on purpose?

If not, I think skipping that would be an improvement.
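
To see the effect described above, here is a minimal sketch on made-up file names. It assumes the earlier version of the pipeline ran a plain `uniq` before `sort`; that version is not shown in this thread, so both the pipeline shape and the sample data are assumptions.

```shell
# Hypothetical log output: a.rb is edited in two consecutive commits,
# then b.rb, then a.rb again.
edits=$(printf 'a.rb\na.rb\nb.rb\na.rb\n')

# A plain uniq before the sort collapses the two adjacent a.rb lines,
# so one edit is lost before counting starts.
with_first_uniq=$(printf '%s\n' "$edits" | uniq | sort | uniq -c | awk '{print $1 "," $2}')

# Sort, then count: every edit is kept.
without_first_uniq=$(printf '%s\n' "$edits" | sort | uniq -c | awk '{print $1 "," $2}')

printf 'with first uniq:\n%s\n\nwithout:\n%s\n' "$with_first_uniq" "$without_first_uniq"
```

With this sample, a.rb is counted twice in the first case and three times in the second.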

Owner

coreyhaines commented Feb 16, 2011

Thanks, Colin, I just caught that when explaining it to Sarah. :)

Owner

coreyhaines commented Feb 16, 2011

Here's the chart it generated
http://vurl.me/ZPQ

If you switch the columns and tab-separate them, they import into Google Docs for charting much more easily :)

awk 'BEGIN {print "file\tcount" } {print $2 "\t" $1}'
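
As a quick check of the awk snippet above, here it is applied to made-up `uniq -c` output. The helper name `to_tsv`, the file names, and the counts are invented for illustration.

```shell
# Swap the columns and tab-separate them, as in the awk one-liner above.
to_tsv() {
  awk 'BEGIN {print "file\tcount"} {print $2 "\t" $1}'
}

# Sample input shaped like uniq -c output: padded count, then file name.
tsv=$(printf '   2 app/a.rb\n   5 lib/b.rb\n' | to_tsv)
printf '%s\n' "$tsv"
```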
Owner

coreyhaines commented Feb 16, 2011

Thanks, Ben. I was working on making it comma-delimited, to learn a bit more awk. Here's what I added:
git log --all -M -C --name-only | grep -E '^(app|lib)/' | sort | uniq -c | sort | awk '{print $1,",",$2}'

Owner

coreyhaines commented Feb 16, 2011

Next up is to iterate over date ranges, so I can draw graphs over time of our codebase

I approve.

But, I'm not convinced of the utility of graphs over time. I've generated many of them over the years and can't remember making a change informed by them. I think they're driven by my insecurity about my work.

However, I am convinced of the utility of insight into summarized repository state. I'd use this as a utility to ask "where do I need to focus my thinking?", not "where did I fail to focus in the past?"

If you scriptify it, I recommend passing $* to the initial git log. That way I can say git_churn --since='1 month ago' to see how I'm doing right now, which is what I care about.
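
A minimal sketch of that idea, offered as one possible shape and not a definitive implementation. The function names `count_churn` and `git_churn` are hypothetical, and `--format='format:'` is an assumption used here in place of the `grep` filter to keep commit headers out of the output. Arguments are forwarded with `"$@"` instead of `$*`, because an unquoted `$*` would split `--since='1 month ago'` into separate words.

```shell
# Count how often each file name appears on stdin; blank lines are skipped.
count_churn() {
  grep -v '^$' | sort | uniq -c | sort -n |
    awk 'BEGIN {print "count,file"} {print $1 "," $2}'
}

# Forward every argument to git log unchanged. "$@" keeps each argument
# intact, so --since='1 month ago' reaches git as a single word.
git_churn() {
  git log --all -M -C --name-only --format='format:' "$@" | count_churn
}
```

Sourced from a shell profile, this could be called as `git_churn --since='1 month ago'`. As in the original one-liners, the `$2` in awk keeps only the first word of a file name that contains spaces.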

Owner

coreyhaines commented Feb 16, 2011

Gary,

I think the graphs over time would be more of an interest from an archeological perspective, rather than changing my current habits.

Thanks for the tip on passing $* to it. I'll do that when I build a script. Or, I could just make a function to put in my bash_profile, no? Maybe I'll ping you to help me.

Yeah, it'd be fine as a function. A script makes it slightly more reusable for others since they just drop the file anywhere on their $PATH.

Owner

coreyhaines commented Feb 16, 2011

True about making it more reusable. I could put it into my dotfiles repo.

Owner

coreyhaines commented Feb 16, 2011

Or, you could, and I can just copy it. HAHA!

Done. https://github.com/garybernhardt/dotfiles/blob/master/bin/git-churn

I removed your grep for 'app|lib'. You can just pass directories straight to git log to log them. This script should work exactly like yours did.

Owner

coreyhaines commented Feb 16, 2011

Awesome, Gary! Thanks!

This might need some more work, since I haven't done anything with it for a while, but it tracks files, classes, and methods for a Ruby project:

https://github.com/danmayer/churn

Great, this helped me a lot. Thanks!

Xodarap commented Mar 9, 2017

If you add -n to the final sort it will sort numerically instead of alphabetically
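
A small illustration on invented counts. `uniq -c` typically left-pads its counts, which can hide the difference for small numbers, so the sample below uses unpadded counts and forces the C locale to make the alphabetical ordering predictable.

```shell
# Made-up "count file" lines, deliberately unpadded.
counts=$(printf '9 a.rb\n100 c.rb\n10 b.rb\n')

# Plain sort compares character by character, so "10" and "100" land before "9".
alphabetical=$(printf '%s\n' "$counts" | LC_ALL=C sort)

# sort -n compares the leading number as a number.
numeric=$(printf '%s\n' "$counts" | sort -n)

printf 'sort:\n%s\n\nsort -n:\n%s\n' "$alphabetical" "$numeric"
```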
