@coreyhaines
Created February 16, 2011 19:04
Bash script to generate churn counts in git repo
churn number and file name
git log --all -M -C --name-only | grep -E '^(app|lib)/' | sort | uniq -c | sort | awk 'BEGIN {print "count,file"} {print $1 "," $2}'
churn number and file name w/ limiting to last n commits
git log --all -n 5000 -M -C --name-only | grep -E '^spec/models' | sort | uniq -c | sort | awk 'BEGIN {print "count,file"} {print $1 "," $2}'
graph of churn number and frequency
git log --all -M -C --name-only | grep -E '^(app|lib)/' | sort | uniq -c | sort | awk '{print $1}' | uniq -c | sort | awk 'BEGIN { print "frequency,churn_count"} { print $1 "," $2}'
@coreyhaines
Author

I pipe this into a file, then I can load it into a spreadsheet.

Help me make it better!

@leshill

leshill commented Feb 16, 2011

No need for the first uniq: sort, count, then sort by count.

@trptcolin

I think your first uniq will lose some data, since uniq will blow away consecutive edits (and only those). Is that on purpose?

If not, I think skipping that would be an improvement.
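
To illustrate the point about the first uniq, here is a minimal sketch on made-up input rather than real git log output. The file names and the `edits` helper are hypothetical; the only thing being compared is whether `uniq` runs before the `sort`.

```shell
# Hypothetical input: four edits, three of them to a.rb, two of those back to back.
edits() { printf 'a.rb\na.rb\nb.rb\na.rb\n'; }

# uniq first: the back-to-back a.rb lines collapse into one before counting.
with_first_uniq=$(edits | uniq | sort | uniq -c | awk '{print $1 "," $2}')

# sort, count, then sort by count: every edit is counted.
without_first_uniq=$(edits | sort | uniq -c | sort -n | awk '{print $1 "," $2}')

printf 'with first uniq:\n%s\n' "$with_first_uniq"
printf 'without first uniq:\n%s\n' "$without_first_uniq"
```

With the leading uniq, a.rb is undercounted because only its consecutive entries are merged; without it, all three edits show up.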

@coreyhaines
Author

Thanks, Colin, I just caught that when explaining it to Sarah. :)

@coreyhaines
Author

Here's the chart it generated
http://vurl.me/ZPQ

@bleything

If you switch the columns and tab-separate them, they import into google docs for charting much more easily :)

awk 'BEGIN {print "file\tcount" } {print $2 "\t" $1}'

@coreyhaines
Author

Thanks, Ben. I was doing a bit to make it comma delimited, trying to learn a bit more awk. Here's what I added:
git log --all -M -C --name-only | grep -E '^(app|lib)/' | sort | uniq -c | sort | awk '{print $1,",",$2}'

@coreyhaines
Author

Next up is to iterate over date ranges, so I can draw graphs of our codebase over time.

@garybernhardt

I approve.

But, I'm not convinced of the utility of graphs over time. I've generated many of them over the years and can't remember making a change informed by them. I think they're driven by my insecurity about my work.

However, I am convinced of the utility of insight into summarized repository state. I'd use this as a utility to ask "where do I need to focus my thinking?", not "where did I fail to focus in the past?"

If you scriptify it, I recommend passing $* to the initial git log. That way I can say git_churn --since='1 month ago' to see how I'm doing right now, which is what I care about.
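
A rough sketch of what such a wrapper could look like, not the actual script anyone in this thread wrote. The names `churn_counts` and `git_churn` are made up for illustration. It assumes `--format='format:'` to drop commit headers (instead of the grep in the gist), and it uses `"$@"` rather than `$*` so that an argument containing spaces, such as `--since='1 month ago'`, reaches git log as a single word.

```shell
# Count how often each name on stdin occurs, lowest count first, as CSV.
# Blank lines (left between commits by the empty format) are dropped first.
# Like the one-liners in the gist, this assumes file names without spaces.
churn_counts() {
  grep -v '^$' | sort | uniq -c | sort -n \
    | awk 'BEGIN {print "count,file"} {print $1 "," $2}'
}

# Hypothetical wrapper: every argument goes straight to git log.
git_churn() {
  git log --all -M -C --name-only --format='format:' "$@" | churn_counts
}
```

Usage would then be along the lines of `git_churn --since='1 month ago'`, or `git_churn -- app lib` to limit the log to those directories.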

@coreyhaines
Author

Gary,

I think the graphs over time would be of more interest from an archeological perspective than as a way to change my current habits.

Thanks for the tip on passing $* to it. I'll do that when I build a script. Or, I could just make a function to put in my bash_profile, no? Maybe I'll ping you to help me.

@garybernhardt

Yeah, it'd be fine as a function. A script makes it slightly more reusable for others since they just drop the file anywhere on their $PATH.

@coreyhaines
Author

True about making it more reusable. I could put it into my dotfiles repo.

@coreyhaines
Author

Or, you could, and I can just copy it. HAHA!

@garybernhardt

Done. https://github.com/garybernhardt/dotfiles/blob/master/bin/git-churn

I removed your grep for 'app|lib'. You can just pass directories straight to git log to log them. This script should work exactly like yours did.

@coreyhaines
Author

Awesome, Gary! Thanks!

@danmayer

This might need some more work, as I haven't done anything with it for a while, but it tracks churn for files, classes, and methods in a Ruby project:

https://github.com/danmayer/churn

@huebnerdaniel

Great, this helped me a lot. Thanks!

@Xodarap

Xodarap commented Mar 9, 2017

If you add -n to the final sort it will sort numerically instead of alphabetically
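
A small sketch of the difference, using made-up counts rather than real output. The counts are left unpadded and `LC_ALL=C` is set on the plain sort so the alphabetical ordering is the same on every machine; the `counts` helper is hypothetical.

```shell
# Made-up churn counts, unpadded so the difference is easy to see.
counts() { printf '9 b.rb\n10 a.rb\n2 c.rb\n'; }

# Plain sort compares text, so "10" lands before "2" and "9".
lexical=$(counts | LC_ALL=C sort)

# sort -n compares the leading numbers.
numeric=$(counts | sort -n)

printf 'sort:\n%s\n' "$lexical"
printf 'sort -n:\n%s\n' "$numeric"
```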

@fuhrmanator

If you remove --all, you can pass a SHA1 to git log and get the churn up to that commit.

@fuhrmanator

I just found https://github.com/AnAppAMonth/git-churn, which is a Python solution giving a more detailed interpretation of churn (additions, subtractions).

@flacle

flacle commented Nov 2, 2020

The solutions I've found online count changes to files irrespective of whether they are new lines or edits to existing lines of code. Hence I made this solution: https://github.com/flacle/truegitcodechurn/
