Convert Git logs to JSON. The first script (git-log2json.sh) is all you need; the other files contain only optional bonus features 😀
#!/usr/bin/env bash
# Use this one-liner to produce a JSON literal from the Git log:
git log \
--pretty=format:'{%n "commit": "%H",%n "author": "%aN <%aE>",%n "date": "%ad",%n "message": "%f"%n},' \
"$@" | \
perl -pe 'BEGIN{print "["}; END{print "]\n"}' | \
perl -pe 's/},]/}]/'
#!/usr/bin/env bash
# OPTIONAL: use this stand-alone shell script to produce a JSON object
# with information similar to git log --stat.
#
# You can then easily cross-reference or merge this with the JSON git
# log, since both are keyed on the commit hash, which is unique.
git log \
--numstat \
--format='%H' \
"$@" | \
perl -lawne '
if (defined $F[1]) {
print qq#{"insertions": "$F[0]", "deletions": "$F[1]", "path": "$F[2]"},#
} elsif (defined $F[0]) {
print qq#],\n"$F[0]": [#
};
END{print qq#],#}' | \
tail -n +2 | \
perl -wpe 'BEGIN{print "{"}; END{print "}"}' | \
tr '\n' ' ' | \
perl -wpe 's#(]|}),\s*(]|})#$1$2#g' | \
perl -wpe 's#,\s*?}$#}#'
/*
* OPTIONAL: use this Node.js expression to merge the data structures
* created by the two shell scripts above
*/
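var gitLog, lstat;
// The "./" prefix makes Node load local files rather than installed modules.
gitLog = require('./git-log.json');
lstat = require('./git-stat.json');
// Attach each commit's per-file stats (undefined if there are none) as "paths".
gitLog.forEach(function(o){
  o.paths = lstat[o.commit];
});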
#!/usr/bin/env bash
# OPTIONAL: Use jq to merge the two JSON files.
jq --slurp '.[1] as $logstat | .[0] | map(.paths = $logstat[.commit])' git-log.json git-stat.json
@sdqali

commented Jun 26, 2012

Thanks @textarcana for this

@CyberShadow

commented Sep 11, 2012

Warning: this doesn't escape special characters, so it will break on any commit message that contains double-quotes or backslashes.
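
To make the failure mode concrete, here is a small Node.js sketch (Node is assumed only because the gist's merge snippet already uses it). It builds a JSON object by pasting the message between quotes, the way the format string does, and compares that with letting `JSON.stringify` do the escaping. The function names and the sample message are made up for illustration.

```javascript
// Hypothetical helper: mimics the git format string by pasting the raw
// message between double quotes without any escaping.
function naiveJson(message) {
  return '{"message": "' + message + '"}';
}

// Safer variant: JSON.stringify escapes quotes, backslashes and control characters.
function escapedJson(message) {
  return JSON.stringify({ message: message });
}

// True if the text parses as JSON.
function parses(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

const sample = 'Fix "quoted" path C:\\temp'; // made-up commit subject
console.log(parses(naiveJson(sample)));   // false
console.log(parses(escapedJson(sample))); // true
```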

@xero

commented Dec 4, 2012

awesome! thanx!

@dreamyguy

commented Feb 9, 2013

I've been using a similar solution in my gitconfig, writing the output from the terminal with $ git gitlogtojson > ChangeLog.json. It works great, but I would really like to include the output of --shortstat in the same JSON item (it is printed for each item, but outside the JSON item). Has anyone gotten this to work?

@textarcana

Owner Author

commented Nov 3, 2013

@CyberShadow: A quick workaround for escaping special characters in commit messages would be to use %f instead of %s in the format string:

           %f: sanitized subject line, suitable for a filename

I've updated the gist to use %f — thanks for pointing this out!

If you need more powerful escaping then see this answer on StackOverflow.
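
One possible shape for more powerful escaping is to keep JSON syntax out of the format string altogether: ask git for raw fields separated by control characters, then let a JSON library do the quoting. The Node.js sketch below assumes git's `%x1f` and `%x1e` hex-byte placeholders as field and record separators, and it assumes those bytes never occur in a commit message. The function name `parseLog` and the choice of fields are illustrative, not part of the gist.

```javascript
// Format to pass to git: fields joined by U+001F, each record ended by U+001E.
//   git log --pretty=format:'%H%x1f%aN <%aE>%x1f%ad%x1f%s%x1e'
const FIELD_SEP = '\x1f';
const RECORD_SEP = '\x1e';

// Turn the raw git output into an array of plain objects.
function parseLog(raw) {
  return raw
    .split(RECORD_SEP)
    .map((record) => record.replace(/^\n/, '')) // drop the newline git puts between entries
    .filter((record) => record.length > 0)
    .map((record) => {
      const [commit, author, date, message] = record.split(FIELD_SEP);
      return { commit, author, date, message };
    });
}

// Usage (not run here):
//   const { execFileSync } = require('child_process');
//   const raw = execFileSync('git',
//     ['log', '--pretty=format:%H%x1f%aN <%aE>%x1f%ad%x1f%s%x1e'],
//     { encoding: 'utf8' });
//   console.log(JSON.stringify(parseLog(raw), null, 2));
```

Because `JSON.stringify` produces the final text, quotes and backslashes in subjects are escaped without any `sed` or Perl post-processing.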

But if your primary need is to do full-text searching of commit messages (or patches for that matter) then command-line Perl is probably not the right tool. Instead I would take a look at this article on indexing Git logs with Solr.

@textarcana

Owner Author

commented Nov 3, 2013

@dreamyguy I also need --shortstat in JSON format, so I've added a shell script for that (it uses --numstat, which reports insertions and deletions per file). The keys of the JSON object are commit hashes, so you can easily cross-reference or merge it with a JSON dump of the Git log.
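
For readers who would rather not post-process with Perl, the same hash-keyed structure can be built in a few lines of Node.js. This is only a sketch: it assumes the input is the text printed by `git log --numstat --format='%H'`, and `parseNumstat` is a name invented here.

```javascript
// Build { "<commit hash>": [{ insertions, deletions, path }, ...] } from the
// output of: git log --numstat --format='%H'
function parseNumstat(text) {
  const stats = {};
  let current = null;
  for (const line of text.split('\n')) {
    if (/^([0-9a-f]{40}|[0-9a-f]{64})$/.test(line)) {
      // A line holding only a full hash (SHA-1 or SHA-256) starts a new commit.
      current = line;
      stats[current] = [];
    } else if (current !== null && line.includes('\t')) {
      // numstat lines are "<insertions>\t<deletions>\t<path>".
      const [insertions, deletions, ...rest] = line.split('\t');
      stats[current].push({ insertions, deletions, path: rest.join('\t') });
    }
  }
  return stats;
}
```

Counts are kept as strings, as in the shell script, because git prints `-` instead of a number for binary files. Commits without any file changes (merge commits, for example) end up with an empty array.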

@mbreslow

commented Oct 22, 2014

Thanks

@ip2k

commented Jan 27, 2015

Here's what I use, with sample output:

git log -1 --pretty=format:'----- DEPLOYMENT REPO INFO -----%n{%n  "commit": "%H",%n  "author":     "%an",%n  "author_email": "%ae",%n  "date": "%ad",%n  "message": "%f"%n}'
----- DEPLOYMENT REPO INFO -----
{
  "commit": "2df29ded7842d54d2ccc00903ebb7f5a05e55850",
  "author": "ip2k",
  "author_email": "someone@example.com",
  "date": "Mon Jan 26 11:56:33 2015 -0800",
  "message": "safe-commit-message"
}
@fedesg

commented Nov 11, 2015

thank you man!

@alexBaizeau

commented May 3, 2016

I achieved a good level of double-quote escaping by doing the following:

git --no-pager log --pretty=format:'{%n  111555commit666222: 111555%H666222,%n  111555author666222: 111555%an <%ae>666222,%n  111555date666222: 111555%ad666222,%n  111555message666222: 111555%s %n %b666222},' "$@" | sed 's/"/\\"/g' | sed 's/111555/"/g' | sed 's/666222/"/g' | perl -pe 'BEGIN{print "["}; END{print "]\n"}' | perl -pe 's/},]/}]/'

It's not super pretty but it escaped the double quotes in the commit messages.
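
The trick above can be restated as three steps: emit placeholder tokens where JSON needs structural quotes, escape every real double quote, then swap the placeholders for quotes. The Node.js sketch below mirrors those steps so they are easier to follow; it also escapes backslashes first, which the `sed` pipeline does not do. The token strings are copied from the command above, the function name is invented, and raw newlines coming from `%n` or `%b` remain a separate problem.

```javascript
const OPEN = '111555';  // placeholder for an opening structural quote
const CLOSE = '666222'; // placeholder for a closing structural quote

function finishQuoting(text) {
  return text
    .replace(/\\/g, '\\\\')  // escape backslashes in user text first
    .replace(/"/g, '\\"')    // then escape real double quotes
    .split(OPEN).join('"')   // finally turn placeholders into JSON quotes
    .split(CLOSE).join('"');
}

const line = '{111555message666222: 111555Revert "bad" change666222}';
console.log(finishQuoting(line));
// {"message": "Revert \"bad\" change"}
```

One caveat with digit-only tokens: the same digits can occur inside a commit hash or a message, in which case a stray quote is inserted. A token containing characters that cannot appear in hex would be a safer choice.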

@dreamyguy

commented May 22, 2016

@textarcana cheers for your reply! Your approach is very interesting, especially because some commits simply don't have stats, something I found out recently. That was why my attempt at parsing the git log with --shortstat failed on a few repos.

I have revisited my three-year-old project, determined to make it as robust as possible while rendering the majority of the pretty=format: placeholders, and I think I've nailed it: https://github.com/dreamyguy/gitlogg

Some of Gitlogg's features are:

  • Parse the git log of multiple repositories into one JSON file.
  • Introduced repository key/value.
  • Introduced files changed, insertions and deletions keys/values.
  • Introduced impact key/value, which represents the cumulative changes for the commit (insertions - deletions).
  • Sanitise double quotes " by converting them to single quotes ' on values that allow user input, like subject.
  • Nearly all the pretty=format: placeholders are featured.
  • Easily include / exclude which keys/values will be parsed to JSON by commenting out/uncommenting the available ones.
  • Thoroughly commented code.
  • Script execution feedback on console.
  • Error handling (since path to repositories needs to be set correctly).
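
As an illustration of the files changed, insertions, deletions and impact keys listed above, the sketch below folds one commit's `--numstat` entries into totals. It is not taken from Gitlogg: the function name, the output key names and the input shape (the objects produced by the gist's numstat script) are assumptions, and binary files, which git reports as `-`, are counted as zero.

```javascript
// Fold one commit's numstat entries into totals.
// Each entry is assumed to look like { insertions: "3", deletions: "1", path: "a.js" }.
function summarize(entries) {
  // git prints "-" instead of a count for binary files; treat that as 0.
  const toCount = (value) => (/^\d+$/.test(String(value)) ? Number(value) : 0);
  let insertions = 0;
  let deletions = 0;
  for (const entry of entries) {
    insertions += toCount(entry.insertions);
    deletions += toCount(entry.deletions);
  }
  return {
    filesChanged: entries.length,
    insertions,
    deletions,
    impact: insertions - deletions, // net change: insertions minus deletions
  };
}

console.log(summarize([
  { insertions: '10', deletions: '4', path: 'src/app.js' },
  { insertions: '-', deletions: '-', path: 'logo.png' },
]));
// { filesChanged: 2, insertions: 10, deletions: 4, impact: 6 }
```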
@textarcana

Owner Author

commented Mar 31, 2017

@dreamyguy gitlogg looks amazing. Will be trying this out at work next week!

@amosshi

commented Aug 4, 2017

Show more fields in the git log

git log \
    --pretty=format:'{%n  "commit-hash": "%H",%n  "abbreviated-commit-hash": "%h",%n  "author-name": "%an",%n  "author-email": "%aE",%n  "author-date": "%aD",%n  "subject": "%s",%n  "sanitized-subject-line": "%f",%n  "body": "%b",%n  "raw-body": "%B",%n  "commit-notes": "%N"%n},' \
    "$@" | \
    perl -pe 'BEGIN{print "["}; END{print "]\n"}' | \
perl -pe 's/},]/}]/'  >> log-json.txt

Example result

{
  "commit-hash": "4f7abc00b5690c0b358f365ad5612cda6fbf96e7",
  "abbreviated-commit-hash": "4f7abc0",
  "author-name": "abcde",
  "author-email": "abcde@0ec998aa-f9da-4275-8eea-c4a1938c285e",
  "author-date": "Thu, 24 Nov 2016 08:44:24 +0000",
  "subject": "Unblocking Check-ins via TeamCity caused by fix for ABC",
  "sanitized-subject-line": "Unblocking-Check-ins-via-TeamCity-caused-by-fix-for-ABC",
  "body": "Reverting changes of ABC",
  "raw-body": "Unblocking Check-ins via TeamCity caused by fix for ABC
Reverting changes of DEF
",
  "commit-notes": ""
},
@Biacode

commented Nov 8, 2017

Thanks

@greggman

commented Apr 8, 2018

@amosshi, your example is not valid JSON. JSON does not allow multi-line strings; line breaks have to be escaped as \n. A valid JSON parser will reject that output. I have no idea whether the others have the same issue, but yours includes an example that is clearly not JSON.
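
This is easy to check with any strict parser. The Node.js sketch below reuses the raw-body text from the example result above and compares pasting it between quotes with encoding it through `JSON.stringify`; the variable and function names are made up here.

```javascript
const rawBody =
  'Unblocking Check-ins via TeamCity caused by fix for ABC\nReverting changes of DEF\n';

// Pasting the body between quotes leaves literal line breaks inside the string.
const pasted = '{"raw-body": "' + rawBody + '"}';

// JSON.stringify writes each line break as the two characters \ and n.
const encoded = JSON.stringify({ 'raw-body': rawBody });

// True if the text parses as JSON.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isValidJson(pasted));    // false
console.log(isValidJson(encoded));   // true
console.log(encoded.includes('\n')); // false: no literal line break remains
```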

@ORESoftware

commented Apr 16, 2018

this works:

git log --pretty='{"author_email":"%ae"}'

you don't need to do this:

git log --pretty=format:'{"author_email":"%ae"}'
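
One difference is worth knowing before dropping `format:`. Git treats a bare `--pretty=<string>` containing `%` placeholders as `tformat:`, which ends every entry with a newline, whereas `format:` only puts newlines between entries. With the gist's Perl wrapper this matters: the closing `]` then lands on its own line, so the `s/},]/}]/` cleanup no longer matches the last comma. The Node.js sketch below is one way to wrap comma-terminated objects that works for both variants; `toJsonArray` is a name made up here.

```javascript
// Wrap comma-terminated JSON objects in brackets. Works whether the input
// ends with "}," (format:) or with "},\n" (tformat:).
function toJsonArray(text) {
  const body = text.trim().replace(/,$/, ''); // drop the final trailing comma
  return '[' + body + ']';
}

const withFormat = '{"author_email":"a@example.com"},\n{"author_email":"b@example.com"},';
const withTformat = withFormat + '\n';

console.log(JSON.parse(toJsonArray(withFormat)).length);  // 2
console.log(JSON.parse(toJsonArray(withTformat)).length); // 2
```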
@haydenk

commented May 26, 2019

Why add the newlines to "prettify" the JSON? If you want a prettified version that is easier to read, you can pipe the output through something like jq.

git log --pretty=format:'{"commit": "%H","author": "%aN <%aE>","date": "%ad","message": "%f"},'

git log --pretty=format:'{"commit": "%H","author": "%aN <%aE>","date": "%ad","message": "%f"}' | jq .
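
With one compact object per line, the output can also be consumed line by line without building a surrounding array first. A minimal Node.js sketch, assuming each commit occupies exactly one line; it tolerates a trailing comma on each line, and `parseJsonLines` is an invented name.

```javascript
// Parse output that has one JSON object per line, for example from
//   git log --pretty=format:'{"commit": "%H","message": "%f"},'
// A trailing comma at the end of a line is ignored.
function parseJsonLines(text) {
  return text
    .split('\n')
    .map((line) => line.trim().replace(/,$/, ''))
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

const commits = parseJsonLines(
  '{"commit": "abc","message": "first-commit"},\n{"commit": "def","message": "second-commit"},'
);
console.log(commits.map((c) => c.commit)); // [ 'abc', 'def' ]
```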

@damonCY

commented Aug 6, 2019

git log --format='%aE' | sort -u | while read name; do echo -en "$name\t"; git log --author="$name" --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }' -; done
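
The loop above runs `git log` once per author. A single-pass alternative is sketched below in Node.js, assuming the input comes from `git log --numstat --pretty=tformat:'@%aE'`. The leading `@` marker and the function name are choices made for this sketch, not git conventions. Note that the "total lines" figure in the one-liner is the net change (added minus removed), which is called `net` here.

```javascript
// Sum added and removed lines per author from the output of
//   git log --numstat --pretty=tformat:'@%aE'
// The leading "@" is an arbitrary marker that tells author lines apart from
// numstat lines, which always start with a digit or "-".
function statsByAuthor(text) {
  const totals = {};
  let author = null;
  for (const line of text.split('\n')) {
    if (line.startsWith('@')) {
      author = line.slice(1);
      if (!totals[author]) totals[author] = { added: 0, removed: 0, net: 0 };
      continue;
    }
    // Binary files are reported as "-\t-\t<path>" and are skipped by this pattern.
    const match = /^(\d+)\t(\d+)\t/.exec(line);
    if (author !== null && match) {
      const added = Number(match[1]);
      const removed = Number(match[2]);
      totals[author].added += added;
      totals[author].removed += removed;
      totals[author].net += added - removed;
    }
  }
  return totals;
}
```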
