Get the metadata and contents of all files in a target GitHub repo using GraphQL
GitHub V4 GraphQL API
I included some useful attributes about files in a GitHub repo in the query file. The query can be modified to work with any repo you have read access to.
The output includes:
- blob - a text or binary file.
- tree - a directory path, which has no content.
- mode field
  - Usually 16384 or 33188.
- text field
  - From the schema: "UTF8 text data or null if the Blob is binary".
  - Includes "\n" for line breaks. Note your code might have "\n" characters in it too.
- isBinary field
  - Useful if you want to separate file types or avoid counting lines in a binary.
  - Binary might be images or compiled files.
- Some extra fields are included but commented out - use them if you want.
- I could not find a summary field for the number of files or the number of lines, so you have to work those out yourself.
- I don't know what else could be used for expression, but the docs say "A Git revision expression suitable for rev-parse".
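The two mode values listed above are the decimal forms of Git's octal file modes: 16384 is 040000 (a directory) and 33188 is 100644 (a regular, non-executable file). The sketch below shows one way to decode them using Python's standard stat module. The helper name describe_mode is hypothetical and not part of the query or the API.

```python
import stat


def describe_mode(mode: int) -> str:
    """Return the octal form of a decimal mode plus a rough description."""
    if stat.S_ISDIR(mode):
        kind = "directory"
    elif stat.S_ISLNK(mode):
        kind = "symlink"
    elif stat.S_ISREG(mode):
        # Any execute bit set means Git stored the file as executable (100755).
        kind = "executable file" if mode & 0o111 else "regular file"
    else:
        kind = "other"  # for example 0o160000, which Git uses for submodules
    return f"{mode:06o} {kind}"


for value in (16384, 33188):
    print(value, "->", describe_mode(value))
# 16384 -> 040000 directory
# 33188 -> 100644 regular file
```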
Try the query out in the explorer.
- Go to the explorer and sign in.
- Paste the GQL query in the main pane.
- Paste sample JSON into the query variables pane.
- Press the play button to run.
Alternatively, run the query using curl or a library in Python, Ruby, etc.
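As a rough illustration of the library route, here is a minimal Python sketch that uses only the standard library. The query text is an assumption reconstructed from the fields described in this document (Tree entries with name, type, mode, and the Blob fields byteSize, text, isBinary); it is not the author's query file. The variable names, the build_request and fetch_entries helpers, the GITHUB_TOKEN environment variable, and the example repo are all hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.github.com/graphql"

# Assumed query: a reconstruction based on the fields named in this README.
QUERY = """
query RepoFiles($owner: String!, $name: String!, $expression: String!) {
  repository(owner: $owner, name: $name) {
    object(expression: $expression) {
      ... on Tree {
        entries {
          name
          type
          mode
          object {
            ... on Blob {
              byteSize
              text
              isBinary
            }
          }
        }
      }
    }
  }
}
"""


def build_request(owner, name, expression, token):
    """Build (but do not send) the POST request for the GraphQL endpoint."""
    payload = {
        "query": QUERY,
        "variables": {"owner": owner, "name": name, "expression": expression},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_entries(request):
    """Send the request and return the list of tree entries."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["data"]["repository"]["object"]["entries"]


# Example usage (needs network access and a personal access token):
#   import os
#   req = build_request("octocat", "Hello-World", "HEAD:", os.environ["GITHUB_TOKEN"])
#   for entry in fetch_entries(req):
#       print(entry["type"], entry["name"])
```

The expression "HEAD:" in the usage comment is assumed to select the root tree of the default branch; a "branch:path" form should select a subdirectory.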
Simplified JSON output
{
  "entries": [
    {
      "name": ".gitignore",
      "type": "blob",
      "object": {
        "byteSize": 32,
        "text": "node_modules/\npackage-lock.json\n"
      }
    },
    {
      "name": ".vscode",
      "type": "tree",
      "object": {}
    },
    {
      "name": "CONTRIBUTING.md",
      "type": "blob",
      "object": {
        "byteSize": 1520,
        "text": "..."
      }
    }
  ]
}
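Since the response carries no file or line totals, they have to be derived from the entries. The following is a minimal sketch of one way to do that in Python for a response shaped like the sample above. The summarize helper is hypothetical, and the logo.png entry is invented here only to show how a binary blob (text is null) would be handled. It counts only the entries present in the response, so it does not descend into subdirectories.

```python
def summarize(entries):
    """Count blobs, trees, and text lines in a list of tree entries."""
    blobs = [e for e in entries if e["type"] == "blob"]
    trees = [e for e in entries if e["type"] == "tree"]
    line_total = 0
    for blob in blobs:
        text = (blob.get("object") or {}).get("text")
        if text is not None:  # text is null for binary blobs
            line_total += len(text.splitlines())
    return {"files": len(blobs), "directories": len(trees), "lines": line_total}


entries = [
    {
        "name": ".gitignore",
        "type": "blob",
        "object": {"byteSize": 32, "text": "node_modules/\npackage-lock.json\n"},
    },
    {"name": ".vscode", "type": "tree", "object": {}},
    # Hypothetical binary entry, not part of the sample output.
    {"name": "logo.png", "type": "blob", "object": {"byteSize": 2048, "text": None}},
]

print(summarize(entries))
# {'files': 2, 'directories': 1, 'lines': 2}
```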
- Thanks to this gist for getting me going with using the Tree and Blob structure.
- Intro to GraphQL
- GitHub
- GitHub V4 GraphQL docs.
- Forming calls
- Target URL for query: api.github.com/graphql
- GraphQL guide that I wrote.