#!/usr/bin/env bash
# Use this one-liner to produce a JSON literal from the Git log:
git log \
    --pretty=format:'{%n "commit": "%H",%n "author": "%aN <%aE>",%n "date": "%ad",%n "message": "%f"%n},' \
    "$@" | \
    perl -pe 'BEGIN{print "["}; END{print "]\n"}' | \
    perl -pe 's/},]/}]/'
#!/usr/bin/env bash
# OPTIONAL: use this stand-alone shell script to produce a JSON object
# with information similar to git log --stat.
#
# You can then easily cross-reference or merge this with the JSON git
# log, since both are keyed on the commit hash, which is unique.
git log \
    --numstat \
    --format='%H' \
    "$@" | \
    perl -lawne '
        if (defined $F[1]) {
            print qq#{"insertions": "$F[0]", "deletions": "$F[1]", "path": "$F[2]"},#
        } elsif (defined $F[0]) {
            print qq#],\n"$F[0]": [#
        };
        END{print qq#],#}' | \
    tail -n +2 | \
    perl -wpe 'BEGIN{print "{"}; END{print "}"}' | \
    tr '\n' ' ' | \
    perl -wpe 's#(]|}),\s*(]|})#$1$2#g' | \
    perl -wpe 's#,\s*?}$#}#'
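/*
 * OPTIONAL: use this Node.js expression to merge the data structures
 * created by the two shell scripts above
 */
var gitLog, lstat, merged;
// A leading "./" makes require() load local files rather than
// look for installed modules.
gitLog = require('./git-log.json');
lstat = require('./git-stat.json');
// Return each commit so that map() yields the merged array.
merged = gitLog.map(function(o){
    o.paths = lstat[o.commit];
    return o;
});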
#!/usr/bin/env bash
# OPTIONAL: Use jq to merge the two JSON files.
jq --slurp '.[1] as $logstat | .[0] | map(.paths = $logstat[.commit])' git-log.json git-stat.json
@textarcana cheers for your reply! Your approach is very interesting, especially because some commits simply don't have stats, something I found out recently. That was why my attempt at parsing the git log with --shortstat failed on a few repos.
I have revisited my three-year-old project, determined to make it as robust as possible while rendering the majority of the pretty=format: placeholders, and I think I've nailed it: https://github.com/dreamyguy/gitlogg
Some of Gitlogg's features are:
- Parse the git log of multiple repositories into one JSON file. ✨
- Introduced repository key/value.
- Introduced files changed, insertions and deletions keys/values.
- Introduced impact key/value, which represents the cumulative changes for the commit (insertions - deletions).
- Sanitise double quotes " by converting them to single quotes ' on values that allow user input, like subject.
- Nearly all the pretty=format: placeholders are featured.
- Easily include / exclude which keys/values will be parsed to JSON by commenting out/uncommenting the available ones.
- Thoroughly commented code.
- Script execution feedback on console.
- Error handling (since path to repositories needs to be set correctly).
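As a rough illustration of how keys like files changed, insertions, deletions and impact could be derived, the tab-separated lines that git log --numstat prints for a commit can be totalled with awk. This is only a sketch and not Gitlogg's actual code: the function name numstat_summary and the output key names are made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Gitlogg): reads `git log --numstat`
# lines for ONE commit on stdin and prints a small JSON summary.
numstat_summary() {
    awk -F'\t' '
        NF == 3 {                        # numstat lines: added<TAB>deleted<TAB>path
            files++
            if ($1 != "-") ins += $1     # binary files report "-" instead of a count
            if ($2 != "-") del += $2
        }
        END {
            # impact follows the definition above: insertions - deletions
            printf "{\"files_changed\": %d, \"insertions\": %d, \"deletions\": %d, \"impact\": %d}\n",
                files, ins, del, ins - del
        }'
}
```

It could be fed with something like git log -1 --numstat --format= | numstat_summary, where the empty --format= suppresses the commit header so that only the numstat lines reach the function.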
@dreamyguy gitlogg looks amazing. Will be trying this out at work next week!
Show more fields in the git log
git log \
    --pretty=format:'{%n "commit-hash": "%H",%n "abbreviated-commit-hash": "%h",%n "author-name": "%an",%n "author-email": "%aE",%n "author-date": "%aD",%n "subject": "%s",%n "sanitized-subject-line": "%f",%n "body": "%b",%n "raw-body": "%B",%n "commit-notes": "%N"%n},' \
    $@ | \
    perl -pe 'BEGIN{print "["}; END{print "]\n"}' | \
    perl -pe 's/},]/}]/' >> log-json.txt
Example result
{
  "commit-hash": "4f7abc00b5690c0b358f365ad5612cda6fbf96e7",
  "abbreviated-commit-hash": "4f7abc0",
  "author-name": "abcde",
  "author-email": "abcde@0ec998aa-f9da-4275-8eea-c4a1938c285e",
  "author-date": "Thu, 24 Nov 2016 08:44:24 +0000",
  "subject": "Unblocking Check-ins via TeamCity caused by fix for ABC",
  "sanitized-subject-line": "Unblocking-Check-ins-via-TeamCity-caused-by-fix-for-ABC",
  "body": "Reverting changes of ABC",
  "raw-body": "Unblocking Check-ins via TeamCity caused by fix for ABC
Reverting changes of DEF
",
  "commit-notes": ""
},
Thanks
@amosshi, your example is not valid JSON. JSON does not allow multi-line strings; line breaks have to be escaped as \n. A valid JSON parser will reject that output. I have no idea whether the others have the same issue, but your example is clearly not JSON.
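To make that concrete, here is a minimal sketch of escaping a value before it is placed inside a JSON string. The function name json_escape is hypothetical, and it only covers backslash, double quote, newline, carriage return and tab; other control characters would still need \u00XX escapes, so treat it as an illustration rather than a complete encoder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: escape a string so it can sit between the double
# quotes of a JSON string. Covers only the most common cases.
json_escape() {
    local s=$1
    s=${s//\\/\\\\}        # backslash first, so later escapes are not doubled
    s=${s//\"/\\\"}        # double quote -> \"
    s=${s//$'\n'/\\n}      # newline      -> \n
    s=${s//$'\r'/\\r}      # carriage return -> \r
    s=${s//$'\t'/\\t}      # tab          -> \t
    printf '%s' "$s"
}
```

With this, a multi-line commit body could be emitted as, for example, printf '{"body": "%s"}\n' "$(json_escape "$(git log -1 --format=%b)")".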
this works:
git log --pretty='{"author_email":"%ae"}'
you don't need to do this:
git log --pretty=format:'{"author_email":"%ae"}'
Why add the newlines to "prettify" the JSON? If you want a prettified version that is easier to read, you can use something like jq.
git log --pretty=format:'{"commit": "%H","author": "%aN <%aE>","date": "%ad","message": "%f"},'
git log --pretty=format:'{"commit": "%H","author": "%aN <%aE>","date": "%ad","message": "%f"}' | jq .
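If each commit is printed as one compact object per line with no trailing comma, the lines can also be joined into a single JSON array without the bracket and trailing-comma fix-ups. The following is a sketch under that assumption; the function name ndjson_to_array is made up for the example, and it does no escaping of its own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wrap one-JSON-object-per-line input in [ ... ],
# inserting commas only BETWEEN objects so there is nothing to strip later.
ndjson_to_array() {
    awk 'BEGIN { printf "[" }
         NF    { printf "%s%s", sep, $0; sep = "," }   # skip blank lines
         END   { print "]" }'
}
```

For instance: git log --pretty=format:'{"commit": "%H","date": "%ad"}' | ndjson_to_array. An empty log yields [], which is still valid JSON.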
git log --format='%aE' | sort -u | while read name; do echo -en "$name\t"; git log --author="$name" --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }' -; done
I just wanted to say thanks to all of you who have commented on this gist with improvements in recent years! I'm so happy that this script is still inspiring people to advance the cause of structured SCM log analytics!
You can use something like jq
Heh. jq 1.0 release 2015, this script first draft was 2011.
But yes, today I would implement things somewhat differently using jq 😀
There is now a Git repo for the code in this gist, which even in 2020 I think is still relevant (despite there now being several other solutions to choose as well).
If you have commented above with regard to code changes, you could now move your comment into a full-on pull request and I would certainly consider all improvements!
https://github.com/context-driven-testing-toolkit/git-log2json
How would I show the files in this as well?
I achieved a good level of double quotes escaping by doing the following
git --no-pager log --pretty=format:'{%n 111555commit666222: 111555%H666222,%n 111555author666222: 111555%an <%ae>666222,%n 111555date666222: 111555%ad666222,%n 111555message666222: 111555%s %n %b666222},' $@ | sed 's/"/\\"/g' | sed 's/111555/"/g' | sed 's/666222/"/g' | perl -pe 'BEGIN{print "["}; END{print "]\n"}' | perl -pe 's/},]/}]/'
It's not super pretty but it escaped the double quotes in the commit messages.
I'm not sure why different special character sequences are used for the opening and closing double quotes. I don't think that provides any benefit. Just use 111555 (or whatever) for both.
(I know the original post I'm commenting on is very old, but it and my comment on it are both still relevant today.)
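A minimal sketch of that single-placeholder variant follows. The function name escape_and_quote is made up, the token 111555 is assumed never to occur in real commit data, and backslashes are escaped as well (which the command above does not do). Newlines inside messages are still not handled.

```shell
#!/usr/bin/env bash
# One placeholder stands in for every structural double quote.
# Assumption: this token never appears in commit hashes, names or messages.
Q='111555'

# Hypothetical helper: escape real backslashes and quotes coming from
# commit data first, then turn the placeholders into structural quotes.
escape_and_quote() {
    sed -e 's/\\/\\\\/g' \
        -e 's/"/\\"/g' \
        -e "s/$Q/\"/g"
}
```

It would be used along the lines of git --no-pager log --pretty=format:"{${Q}commit${Q}: ${Q}%H${Q}, ${Q}message${Q}: ${Q}%s${Q}}," | escape_and_quote, followed by the same two perl steps as before to add the brackets.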
You can use jq itself to escape everything:
while IFS=$'\t' read -d '' -r -u 3 commit_hash author_date subject body; do
    jq --null-input \
        --arg commit_hash "$commit_hash" \
        --arg author_date "$author_date" \
        --arg subject "$subject" \
        --arg body "$body" \
        '{
            "commit_hash":(if $commit_hash == "" then null else $commit_hash end),
            "author_date":(if $author_date == "" then null else $author_date end),
            "subject":(if $subject == "" then null else $subject end),
            "body":(if $body == "" then null else $body end),
        }'
done 3< <(git log --expand-tabs --pretty='tformat:%H%x09%aI%x09%s%x09%b' -z)
It's a bit clunky, but should handle any weird formatting. Basically, format the log using NUL-terminated lines with tab column separators, then pass that into jq for escaping, treating all empty strings as null.