@textarcana
Last active March 1, 2024 05:26
Convert Git logs to JSON. The first script (git-log2json.sh) is all you need; the other files contain only optional bonus features 😀
THIS GIST NOW HAS A FULL GIT REPO: https://github.com/context-driven-testing-toolkit/git-log2json
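#!/usr/bin/env bash
# Use this one-liner to produce a JSON literal from the Git log.
# Any arguments are passed through to git log (for example: -n 10).
# Note: only the message is sanitized (%f); a double quote in an author
# name would still produce invalid JSON.
git log \
--pretty=format:'{%n "commit": "%H",%n "author": "%aN <%aE>",%n "date": "%ad",%n "message": "%f"%n},' \
"$@" | \
perl -pe 'BEGIN{print "["}; END{print "]\n"}' | \
perl -pe 's/},]/}]/'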
#!/usr/bin/env bash
# OPTIONAL: use this stand-alone shell script to produce a JSON object
# with information similar to git log --stat.
#
# You can then easily cross-reference or merge this with the JSON git
# log, since both are keyed on the commit hash, which is unique.
git log \
--numstat \
--format='%H' \
"$@" | \
perl -lawne '
if (defined $F[1]) {
print qq#{"insertions": "$F[0]", "deletions": "$F[1]", "path": "$F[2]"},#
} elsif (defined $F[0]) {
print qq#],\n"$F[0]": [#
};
END{print qq#],#}' | \
tail -n +2 | \
perl -wpe 'BEGIN{print "{"}; END{print "}"}' | \
tr '\n' ' ' | \
perl -wpe 's#(]|}),\s*(]|})#$1$2#g' | \
perl -wpe 's#,\s*?}$#}#'
/*
* OPTIONAL: use this Node.js expression to merge the data structures
* created by the two shell scripts above
*/
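var gitLog, lstat;
// A leading "./" makes require() load local files instead of searching node_modules.
gitLog = require('./git-log.json');
lstat = require('./git-stat.json');
// Attach each commit's file stats, looked up by commit hash, and return the merged array.
gitLog.map(function(o){
o.paths = lstat[o.commit];
return o;
});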
#!/usr/bin/env bash
# OPTIONAL: Use jq to merge the two JSON files.
jq --slurp '.[1] as $logstat | .[0] | map(.paths = $logstat[.commit])' git-log.json git-stat.json
@damonCY

damonCY commented Aug 6, 2019

git log --format='%aE' | sort -u | while read name; do echo -en "$name\t"; git log --author="$name" --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }' -; done

@textarcana
Author

I just wanted to say thanks to all of you who have commented on this gist with improvements in recent years! I'm so happy that this script is still inspiring people to advance the cause of structured SCM log analytics!

@textarcana
Author

textarcana commented Feb 1, 2020

@haydenk

You can use something like jq

Heh. jq 1.0 was released in 2015; the first draft of this script was written in 2011.

But yes, today I would implement things somewhat differently using jq 😀

@textarcana
Author

textarcana commented Feb 17, 2020

There is now a Git repo for the code in this gist, which even in 2020 I think is still relevant (despite there now being several other solutions to choose from as well).

If you have commented above with regard to code changes, you could now move your comment into a full-on pull request and I would certainly consider all improvements!

https://github.com/context-driven-testing-toolkit/git-log2json

@hard-coded0

Show more fields in the git log

git log \
--pretty=format:'{%n "commit-hash": "%H",%n "abbreviated-commit-hash": "%h",%n "author-name": "%an",%n "author-email": "%aE",%n "author-date": "%aD",%n "subject": "%s",%n "sanitized-subject-line": "%f",%n "body": "%b",%n "raw-body": "%B",%n "commit-notes": "%N"%n},' \
"$@" | \
perl -pe 'BEGIN{print "["}; END{print "]\n"}' | \
perl -pe 's/},]/}]/' >> log-json.txt
Example result

{
"commit-hash": "4f7abc00b5690c0b358f365ad5612cda6fbf96e7",
"abbreviated-commit-hash": "4f7abc0",
"author-name": "abcde",
"author-email": "abcde@0ec998aa-f9da-4275-8eea-c4a1938c285e",
"author-date": "Thu, 24 Nov 2016 08:44:24 +0000",
"subject": "Unblocking Check-ins via TeamCity caused by fix for ABC",
"sanitized-subject-line": "Unblocking-Check-ins-via-TeamCity-caused-by-fix-for-ABC",
"body": "Reverting changes of ABC",
"raw-body": "Unblocking Check-ins via TeamCity caused by fix for ABC
Reverting changes of DEF
",
"commit-notes": ""
},

How would I show the files in this as well?
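One possible way to get the files, sketched under assumptions rather than taken from the gist: build a second JSON object keyed on the commit hash from `git log --numstat --format='%H'`, then merge it with the log JSON (for example with the jq command shown earlier). The helper name `numstat_to_json` is hypothetical. It splits on tabs so that paths containing spaces stay intact, and it assumes paths contain no double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Sketch only: numstat_to_json is a hypothetical helper, not part of the gist.
# It reads the output of `git log --numstat --format='%H'` on stdin and prints
# one JSON object that maps each commit hash to an array of changed files.
# Assumption: paths contain no double quotes or backslashes.
numstat_to_json() {
  awk -F'\t' '
    BEGIN { printf "{" }
    NF == 1 && $1 != "" {     # a line holding only a commit hash
      if (seen++) printf "],"
      printf "\"%s\":[", $1
      first = 1
      next
    }
    NF >= 3 {                 # a numstat line: insertions, deletions, path
      if (!first) printf ","
      first = 0
      printf "{\"insertions\":\"%s\",\"deletions\":\"%s\",\"path\":\"%s\"}", $1, $2, $3
    }
    END { if (seen) printf "]"; print "}" }
  '
}

# Intended usage, run inside a repository:
#   git log --numstat --format='%H' | numstat_to_json > git-stat.json
```

Commits that touch no files (many merge commits, for instance) come out with an empty array, and binary files keep the `-` placeholders that `--numstat` prints for them.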

@stephencmorton

I achieved a good level of double quotes escaping by doing the following

git --no-pager log     --pretty=format:'{%n  111555commit666222: 111555%H666222,%n  111555author666222: 111555%an <%ae>666222,%n  111555date666222: 111555%ad666222,%n  111555message666222: 111555%s %n %b666222},'     $@ | sed 's/"/\\"/g' |  sed 's/111555/"/g' | sed 's/666222/"/g' | perl -pe 'BEGIN{print "["}; END{print "]\n"}' | perl -pe 's/},]/}]/' 

It's not super pretty but it escaped the double quotes in the commit messages.

I'm not sure why different special character sequences are used for "opening double quote" and "closing double quote". I don't think that provides any benefit. Just use 111555 (or whatever) for both.

(I know the original post I'm commenting on is very old, but it and my comment on it are both still relevant today.)
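The single-token suggestion above can be sketched roughly as follows. This is an illustration, not the commenter's exact code: the function name `quote_with_placeholder` is hypothetical, the token `111555` is assumed never to occur in real commit data, and the extra backslash-escaping step is an addition that the quoted one-liner does not have.

```shell
#!/usr/bin/env bash
# Sketch only: Q is a placeholder token standing in for every JSON double
# quote. Assumption: the token never occurs in real commit data.
Q='111555'

# Hypothetical helper: reads placeholder-quoted text on stdin and writes
# JSON-style quoting on stdout.
quote_with_placeholder() {
  # 1. escape backslashes, 2. escape real double quotes,
  # 3. turn each placeholder into a structural double quote.
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e "s/${Q}/\"/g"
}

# Example input, standing in for one line of `git log --pretty=format:...` output.
printf '%s\n' "  ${Q}message${Q}: ${Q}Fix \"quoted\" text${Q}," | quote_with_placeholder
```

In a real pipeline the `--pretty=format:` string would use the one token wherever a JSON quote belongs, and the function would sit between `git log` and the two `perl` steps. Newlines inside `%b` would still need separate handling.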
