@mrVanDalo
Last active January 2, 2021 01:24
gitlog2json
from git import Repo
import json
import click


class GitLogger:
    """Provide a repository's log as JSON-printable dicts, one per commit."""

    def __init__(self, path):
        """Create a GitLogger with the path to the git repository (not a bare repository)."""
        self.repo = Repo(path)

    def log(self):
        """Return a lazy generator of commit dicts."""
        commits = (self.repo.commit(logEntry) for logEntry in self.repo.iter_commits())
        return (self.to_dict(x) for x in commits)

    def to_dict(self, commit):
        """Create a dict out of a commit that is easy to JSON-serialize."""
        return {
            "author_email": commit.author.email,
            "author_name": commit.author.name,
            "authored_date": commit.authored_datetime.isoformat(),
            "changes": commit.stats.files,
            "committed_date": commit.committed_datetime.isoformat(),
            "committer_email": commit.committer.email,
            "committer_name": commit.committer.name,
            "encoding": commit.encoding,
            "hash": commit.hexsha,
            "message": commit.message,
            "summary": commit.summary,
            "size": commit.size,
            "stats_total": commit.stats.total,
            "parents": [parent.hexsha for parent in commit.parents],
        }


@click.command()
@click.argument("path", type=click.Path(exists=True), envvar='PWD')
def main(path):
    # Print one JSON object per commit, one per line.
    for entry in GitLogger(path).log():
        print(json.dumps(entry))


if __name__ == '__main__':
    main()
@khalidelhaji

Dope!

@nilleb

nilleb commented Jan 18, 2020

Huge kudos to you!

@butterl

butterl commented Dec 29, 2020

@mrVanDalo
Any idea if GitLogger(path).log() could pass args like --no-merges and rev1..rev2?

@mrVanDalo
Author

mrVanDalo commented Dec 30, 2020

@butterl I guess so. I think you'll have to pass these arguments to self.repo in the iter_commits() call. Documentation is here
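For illustration, a minimal sketch of how log() might forward such options. It assumes GitPython's documented behaviour that iter_commits(rev, **kwargs) hands its keyword arguments to git rev-list, so no_merges=True ends up as --no-merges. Taking a repo object in the constructor (instead of a path) and the shortened dict are changes from the gist, made only to keep the sketch small.

```python
class GitLogger:
    """Variant of the gist's GitLogger whose log() accepts rev-list options."""

    def __init__(self, repo):
        # Assumption: `repo` is a git.Repo (or anything exposing iter_commits).
        self.repo = repo

    def log(self, rev=None, **rev_list_kwargs):
        """Lazily yield commit dicts for `rev`, e.g. "rev1..rev2"."""
        # Keyword arguments are passed through untouched to iter_commits.
        for commit in self.repo.iter_commits(rev, **rev_list_kwargs):
            yield {
                "hash": commit.hexsha,
                "summary": commit.summary,
                "parents": [parent.hexsha for parent in commit.parents],
            }


# Usage with GitPython (not executed here):
#   from git import Repo
#   for entry in GitLogger(Repo(".")).log("rev1..rev2", no_merges=True):
#       print(entry)
```

The `**rev_list_kwargs` parameter simply collects any extra keyword arguments into a dict and passes them along unchanged.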

@butterl

butterl commented Dec 31, 2020

Thanks, I formatted the kwargs and rev into proper parameters, and that works. At the beginning I did not understand what kwargs meant.

By the way, I modified your example for a multi-repo use case. The line
commits = (self.repo.commit(logEntry) for logEntry in self.repo.iter_commits())
may fail to catch an exception for error handling if the GitPython API raises one (I guess the lambda made the exception uncatchable).
In my case git rev-list can hit a "rev does not exist" error across multiple repos, so I changed it to build a list in order to handle the exception.
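A note on the likely cause, as a sketch: the quoted line is a generator expression rather than a lambda, and a generator expression runs nothing until it is iterated. A try/except around the line that creates it therefore sees no error; the exception surfaces later, inside the loop that consumes it. The `load` function below is a hypothetical stand-in for a lookup that fails on an unknown revision, not a GitPython call.

```python
def load(rev):
    """Hypothetical lookup that fails for an unknown revision."""
    if rev == "missing":
        raise ValueError(f"unknown revision: {rev}")
    return rev.upper()


revs = ["a", "missing", "b"]

# Creating the generator calls load() zero times, so nothing can be raised here.
lazy = (load(rev) for rev in revs)

# The error appears only during iteration, so the try/except belongs here.
seen = []
error = None
try:
    for item in lazy:
        seen.append(item)
except ValueError as exc:
    error = str(exc)

# Once an exception has escaped a generator it is finished: "b" is never reached.
leftover = list(lazy)
```

Building a list up front, as described above, moves the failure to the point where the list is created, which is why the exception becomes catchable there.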

@mrVanDalo
Author

Thanks, I formatted the kwargs and rev into proper parameters, and that works. At the beginning I did not understand what kwargs meant.

Yeah, I'm no Python expert; I actually barely use it. I would just trial-and-error my way through the kwargs.

By the way, I modified your example for a multi-repo use case. The line
commits = (self.repo.commit(logEntry) for logEntry in self.repo.iter_commits())
may fail to catch an exception for error handling if the GitPython API raises one (I guess the lambda made the exception uncatchable).
In my case git rev-list can hit a "rev does not exist" error across multiple repos, so I changed it to build a list in order to handle the exception.

Sounds reasonable. I used the (... for key in ...) form because the documentation says it's lazy. I was working with quite big repositories, and creating a list upfront consumed too much RAM (in my case). But if it works for you, all good :D
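One possible middle ground, sketched under assumptions: a generator function can stay lazy and still handle errors per repository by putting the try/except inside the generator. The names `safe_log`, `open_log` and `on_error` are hypothetical helpers for illustration, not part of the gist or of GitPython.

```python
def safe_log(sources, open_log, on_error=None):
    """Lazily yield entries from each source, skipping the rest of a source that fails.

    `open_log(source)` must return an iterable of entries; with the gist this
    could be `lambda path: GitLogger(path).log()`.
    """
    for source in sources:
        try:
            # Errors raised while iterating this source are caught below.
            for entry in open_log(source):
                yield entry
        except Exception as exc:
            # Report the failure and continue with the next source.
            if on_error is not None:
                on_error(source, exc)


# Usage with the gist's class (not executed here):
#   for entry in safe_log(paths, lambda p: GitLogger(p).log()):
#       print(json.dumps(entry))
```

Only one entry is held in memory at a time, and entries yielded before a failure are kept rather than discarded.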
