Convert Git logs to JSON. The first script (git-log2json.sh) is all you need; the other files contain only optional bonus features 😀
#!/usr/bin/env bash
# Use this one-liner to produce a JSON literal from the Git log.
# %f is the sanitized subject line, so "message" is safe to embed as-is;
# the author name and email are NOT escaped.
git log \
    --pretty=format:'{%n "commit": "%H",%n "author": "%aN <%aE>",%n "date": "%ad",%n "message": "%f"%n},' \
    "$@" | \
    perl -pe 'BEGIN{print "["}; END{print "]\n"}' | \
    perl -pe 's/},]/}]/'
#!/usr/bin/env bash
# OPTIONAL: use this stand-alone shell script to produce a JSON object
# with information similar to git log --stat.
#
# You can then easily cross-reference or merge this with the JSON git
# log, since both are keyed on the commit hash, which is unique.
git log \
    --numstat \
    --format='%H' \
    "$@" | \
    perl -lawne '
        # numstat lines have three fields; a line with one field is a commit hash
        if (defined $F[1]) {
            print qq#{"insertions": "$F[0]", "deletions": "$F[1]", "path": "$F[2]"},#
        } elsif (defined $F[0]) {
            print qq#],\n"$F[0]": [#
        };
        END{print qq#],#}' | \
    tail -n +2 | \
    perl -wpe 'BEGIN{print "{"}; END{print "}"}' | \
    tr '\n' ' ' | \
    perl -wpe 's#(]|}),\s*(]|})#$1$2#g' | \
    perl -wpe 's#,\s*?}$#}#'
/*
* OPTIONAL: use this Node.js expression to merge the data structures
* created by the two shell scripts above
*/
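var gitLog, lstat;
// A leading "./" is required; without it Node looks in node_modules.
gitLog = require('./git-log.json');
lstat = require('./git-stat.json');
gitLog.map(function(o){
    o.paths = lstat[o.commit];
    return o; // without this, map() yields an array of undefined
});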
#!/usr/bin/env bash
# OPTIONAL: Use jq to merge the two JSON files.
jq --slurp '.[1] as $logstat | .[0] | map(.paths = $logstat[.commit])' git-log.json git-stat.json
@sdqali

sdqali commented Jun 26, 2012

Thanks @textarcana for this

@CyberShadow

CyberShadow commented Sep 11, 2012

Warning: this doesn't escape special characters, so it will break on any commit message that contains double-quotes or backslashes.
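
A minimal sketch of one way to handle this for single-line values: pass each value through a small escaping filter before it is wrapped in quotes. The function name `json_escape` is hypothetical and not part of the gist. It covers only backslashes and double quotes, not newlines or other control characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: escape backslashes and double quotes on stdin so
# each line can be embedded inside a JSON string literal.
# Backslashes must be handled first, otherwise the backslashes added
# for the quotes would themselves be doubled.
json_escape() {
    sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}
```

For example, `git log --pretty=format:'%s' | json_escape` would print one escaped subject per line.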

@xero

xero commented Dec 4, 2012

awesome! thanx!

@dreamyguy

dreamyguy commented Feb 9, 2013

I've been using a similar solution in my gitconfig, and outputting it from the terminal with $ git gitlogtojson > ChangeLog.json. It works great, but I would really like to include the output of --shortstat in the same JSON item (it does print for each item, but outside the JSON item). Has anyone got this working?

@textarcana

Owner Author

textarcana commented Nov 3, 2013

@CyberShadow: A quick workaround for escaping special characters in commit messages would be to use %f instead of %s in the format string:

           %f: sanitized subject line, suitable for a filename

I've updated the gist to use %f — thanks for pointing this out!

If you need more powerful escaping then see this answer on StackOverflow.

But if your primary need is to do full-text searching of commit messages (or patches for that matter) then command-line Perl is probably not the right tool. Instead I would take a look at this article on indexing Git logs with Solr.

@textarcana

Owner Author

textarcana commented Nov 3, 2013

@dreamyguy I also need --shortstat in JSON format so I've added a shell script for that. The keys for the JSON object are commit hashes so you can easily cross-reference or merge it with a JSON dump of the Git log.

@mbreslow

mbreslow commented Oct 22, 2014

Thanks

@ip2k

ip2k commented Jan 27, 2015

Here's what I use, with sample output:

git log -1 --pretty=format:'----- DEPLOYMENT REPO INFO -----%n{%n  "commit": "%H",%n  "author":     "%an",%n  "author_email": "%ae",%n  "date": "%ad",%n  "message": "%f"%n}'
----- DEPLOYMENT REPO INFO -----
{
  "commit": "2df29ded7842d54d2ccc00903ebb7f5a05e55850",
  "author": "ip2k",
  "author_email": "someone@example.com",
  "date": "Mon Jan 26 11:56:33 2015 -0800",
  "message": "safe-commit-message"
}
@fedesg

fedesg commented Nov 11, 2015

thank you man!

@alexBaizeau

alexBaizeau commented May 3, 2016

I achieved a good level of double quotes escaping by doing the following

git --no-pager log \
    --pretty=format:'{%n  111555commit666222: 111555%H666222,%n  111555author666222: 111555%an <%ae>666222,%n  111555date666222: 111555%ad666222,%n  111555message666222: 111555%s %n %b666222},' \
    "$@" | \
    sed 's/"/\\"/g' | \
    sed 's/111555/"/g' | \
    sed 's/666222/"/g' | \
    perl -pe 'BEGIN{print "["}; END{print "]\n"}' | \
    perl -pe 's/},]/}]/'

It's not super pretty but it escaped the double quotes in the commit messages.

@dreamyguy

dreamyguy commented May 22, 2016

@textarcana cheers for your reply! Your approach is very interesting, especially because some commits simply don't have stats, something I found out recently. That was why my attempt at parsing the git log with --shortstat failed on a few repos.

I have revisited my 3-year-old project, determined to make it as robust as possible while rendering the majority of the pretty=format: placeholders, and I think I've nailed it: https://github.com/dreamyguy/gitlogg

Some of Gitlogg's features are:

  • Parse the git log of multiple repositories into one JSON file.
  • Introduced repository key/value.
  • Introduced files changed, insertions and deletions keys/values.
  • Introduced impact key/value, which represents the cumulative changes for the commit (insertions - deletions).
  • Sanitise double quotes " by converting them to single quotes ' on values that allow user input, like subject.
  • Nearly all the pretty=format: placeholders are featured.
  • Easily include / exclude which keys/values will be parsed to JSON by commenting out/uncommenting the available ones.
  • Thoroughly commented code.
  • Script execution feedback on console.
  • Error handling (since path to repositories needs to be set correctly).
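
To illustrate the files changed, insertions, deletions, and impact values from the list above, here is a rough sketch (not Gitlogg's actual code) that derives them from `--numstat` lines. The function name `numstat_summary` and the JSON key names are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise numstat lines read from stdin.
# Each numstat line is "<insertions> TAB <deletions> TAB <path>";
# binary files show "-" in both count columns and add nothing to the totals.
numstat_summary() {
    awk -F '\t' '
        NF >= 3 {
            files += 1
            if ($1 != "-") ins += $1
            if ($2 != "-") dels += $2
        }
        END {
            # impact = insertions - deletions
            printf "{\"files_changed\": %d, \"insertions\": %d, \"deletions\": %d, \"impact\": %d}\n", files, ins, dels, ins - dels
        }
    '
}
```

For a single commit this could be used as `git log -1 --numstat --format= | numstat_summary`.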
@textarcana

Owner Author

textarcana commented Mar 31, 2017

@dreamyguy gitlogg looks amazing. Will be trying this out at work next week!

@amosshi

amosshi commented Aug 4, 2017

Show more fields in the git log

git log \
    --pretty=format:'{%n  "commit-hash": "%H",%n  "abbreviated-commit-hash": "%h",%n  "author-name": "%an",%n  "author-email": "%aE",%n  "author-date": "%aD",%n  "subject": "%s",%n  "sanitized-subject-line": "%f",%n  "body": "%b",%n  "raw-body": "%B",%n  "commit-notes": "%N"%n},' \
    "$@" | \
    perl -pe 'BEGIN{print "["}; END{print "]\n"}' | \
    perl -pe 's/},]/}]/'  >> log-json.txt

Example result

{
  "commit-hash": "4f7abc00b5690c0b358f365ad5612cda6fbf96e7",
  "abbreviated-commit-hash": "4f7abc0",
  "author-name": "abcde",
  "author-email": "abcde@0ec998aa-f9da-4275-8eea-c4a1938c285e",
  "author-date": "Thu, 24 Nov 2016 08:44:24 +0000",
  "subject": "Unblocking Check-ins via TeamCity caused by fix for ABC",
  "sanitized-subject-line": "Unblocking-Check-ins-via-TeamCity-caused-by-fix-for-ABC",
  "body": "Reverting changes of ABC",
  "raw-body": "Unblocking Check-ins via TeamCity caused by fix for ABC
Reverting changes of DEF
",
  "commit-notes": ""
},
@Biacode

Biacode commented Nov 8, 2017

Thanks

@greggman

greggman commented Apr 8, 2018

@amosshi, your example is not valid JSON. JSON does not allow multi-line strings; line breaks have to be escaped as \n. A valid JSON parser will reject that output. I have no idea if the others have the same issue, but yours has an example that's clearly not JSON.

@ORESoftware

ORESoftware commented Apr 16, 2018

this works:

git log --pretty='{"author_email":"%ae"}'

you don't need to do this:

git log --pretty=format:'{"author_email":"%ae"}'
@haydenk

haydenk commented May 26, 2019

Why add the newlines to "prettify" the JSON? You can pipe the output through something like jq if you want a prettified, more readable version.

git log --pretty=format:'{"commit": "%H","author": "%aN <%aE>","date": "%ad","message": "%f"},'

git log --pretty=format:'{"commit": "%H","author": "%aN <%aE>","date": "%ad","message": "%f"}' | jq .

@damonCY

damonCY commented Aug 6, 2019

git log --format='%aE' | sort -u | while read -r name; do
    printf '%s\t' "$name"
    git log --author="$name" --pretty=tformat: --numstat |
        awk '{ add += $1; subs += $2; loc += $1 - $2 }
             END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }' -
done
