@elliott-wen
Last active September 16, 2023 00:31
How to use GHFeed
import base64
import gzip
import json

import requests

# Download the log file
resp = requests.get("https://www.ghfeed.org/data/2023-09-16-10.json.gz")
resp.raise_for_status()
with open('data.gz', 'wb') as out:
    out.write(resp.content)

# Parse the log file
with gzip.open('data.gz', 'rt') as fp:
    # Load each commit event (one JSON object per line)
    for line in fp:
        commit = json.loads(line)
        repo_name = commit['repo']
        # Iterate over each newly uploaded file
        files = commit['files']
        for file in files:
            sha = file['sha']
            # To get the file content, use the GitHub API
            content_url = "https://api.github.com/repos/%s/git/blobs/%s" % (repo_name, sha)
            blob_resp = requests.get(content_url).json()
            # The API returns the blob base64-encoded, so decode it to bytes
            blob_content = base64.b64decode(blob_resp['content'])
            # Do whatever you like with the content
            # ...
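
The loop above can also be split into two small helpers that are easy to test without network access. This is only a minimal sketch: it assumes each log line is a JSON object with the `repo` and `files[].sha` fields used above, and that the blob response carries `content` together with an `encoding` field (`base64` for the GitHub blobs endpoint). The names `BLOB_URL`, `iter_blob_urls`, and `decode_blob` are hypothetical and not part of GHFeed.

```python
import base64
import json

# URL template taken from the snippet above
BLOB_URL = "https://api.github.com/repos/%s/git/blobs/%s"


def iter_blob_urls(line):
    """Yield one GitHub blob URL per file listed in a single log line."""
    commit = json.loads(line)
    for file in commit.get("files", []):
        yield BLOB_URL % (commit["repo"], file["sha"])


def decode_blob(blob_resp):
    """Return the raw bytes of a blob response (assumed fields: content, encoding)."""
    content = blob_resp["content"]
    if blob_resp.get("encoding") == "base64":
        # b64decode discards the newlines embedded in the payload by default
        return base64.b64decode(content)
    # Assumption: any other encoding is plain text
    return content.encode("utf-8")
```

Each URL from `iter_blob_urls(line)` can be fetched with `requests.get(url).json()` and the result passed to `decode_blob`. Unauthenticated GitHub API requests are rate-limited, so processing a whole log file this way will probably need an access token.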