
@unbracketed
Last active August 3, 2023 18:13
Export Issues from Github repo to CSV (API v3)
"""
Exports Issues from a specified repository to a CSV file
Uses basic authentication (Github username + password) to retrieve Issues
from a repository that username has access to. Supports Github API v3.
"""
import csv
import requests
GITHUB_USER = ''
GITHUB_PASSWORD = ''
REPO = '' # format is username/repo
ISSUES_FOR_REPO_URL = 'https://api.github.com/repos/%s/issues' % REPO
AUTH = (GITHUB_USER, GITHUB_PASSWORD)
def write_issues(response):
"output a list of issues to csv"
if not r.status_code == 200:
raise Exception(r.status_code)
for issue in r.json():
labels = issue['labels']
for label in labels:
if label['name'] == "Client Requested":
csvout.writerow([issue['number'], issue['title'].encode('utf-8'), issue['body'].encode('utf-8'), issue['created_at'], issue['updated_at']])
r = requests.get(ISSUES_FOR_REPO_URL, auth=AUTH)
csvfile = '%s-issues.csv' % (REPO.replace('/', '-'))
csvout = csv.writer(open(csvfile, 'wb'))
csvout.writerow(('id', 'Title', 'Body', 'Created At', 'Updated At'))
write_issues(r)
#more pages? examine the 'link' header returned
if 'link' in r.headers:
pages = dict(
[(rel[6:-1], url[url.index('<')+1:-1]) for url, rel in
[link.split(';') for link in
r.headers['link'].split(',')]])
while 'last' in pages and 'next' in pages:
r = requests.get(pages['next'], auth=AUTH)
write_issues(r)
if pages['next'] == pages['last']:
break
@boombatower

Since this is high up in search results... If you're just looking for a JSON dump (presumably with credentials, since public repos can be done in the browser):

#!/bin/bash

repo=$1
filename=$(echo "$repo.json" | tr / -)
echo "Dumping $repo to $filename..."
echo
echo

# remove -u if not private
curl -u "user:pass" \
  "https://api.github.com/repos/$repo/issues?per_page=1000&state=all" \
  > "$filename"

Note you can set per_page to avoid needing to check headers and do multiple requests in most cases.
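The same idea works from Python via the script's requests library; note that in practice GitHub clamps per_page to a maximum of 100. A minimal sketch using a prepared request so the encoded query string is visible (the owner/repo name is a made-up placeholder):

```python
from requests import Request

# Hypothetical repository; GitHub caps per_page at 100.
url = 'https://api.github.com/repos/someuser/somerepo/issues'

# requests appends the params dict to the URL as a query string.
req = Request('GET', url, params={'per_page': 100, 'state': 'all'}).prepare()
print(req.url)
```

Sending it with `requests.get(url, params=..., auth=AUTH)` behaves the same way; the prepared form just makes the resulting URL easy to inspect.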

@cbonilla20

Nice @boombatower!

@bojanpopic

Thanks @boombatower. However, your script tries to get 1000 items, and the GitHub API has a maximum of 100 items per page, so it didn't work for me with 500+ issues. I modified it to use pagination. It's super dirty, but it worked. Here is the gist: https://gist.github.com/bojanpopic/2c3025d2952844de1dd0

@awneil

awneil commented May 19, 2016

I tried removing the 'Client Requested' - per @Pinwheeler, 24 Jan 2014

And copying the 'dict' part - per @davedyk, 12 Nov 2014

And adding 'state=all' - per @markjd84, 13 Apr 2015

And adding per_page=100 (or 1000) - per @boombatower, 25 Sep 2015, and @bojanpopic, 29 Jan 2016.

But none of it gives me a complete list of all issues.

What am I missing?

@awneil

awneil commented May 19, 2016

It looks like the 2nd (and subsequent?) calls to write_issues() don't manage to parse the JSON, so they don't find any issues to put into the CSV?

I did try using the argument - per @js9045, 12 Nov 2014 - but that didn't help.
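For what it's worth, requests can parse the Link header itself, which avoids hand-rolling the dict comprehension in the script. A minimal sketch with a hard-coded header value in the format GitHub returns (the URLs are made up):

```python
from requests.utils import parse_header_links

# A made-up Link header in the format GitHub returns.
header = ('<https://api.github.com/repos/o/r/issues?page=2>; rel="next", '
          '<https://api.github.com/repos/o/r/issues?page=5>; rel="last"')

# parse_header_links yields one dict per link, with 'url' and 'rel' keys.
pages = {link['rel']: link['url'] for link in parse_header_links(header)}
print(pages['next'])
```

On a live response object, r.links exposes the same parsed structure, e.g. r.links['next']['url'].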

@jithingangadharan

jithingangadharan commented Jun 12, 2016

Thank you for sharing this code. How can I retrieve issues from a private repository?
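The script's basic auth already covers private repos the account can see, but a personal access token works too, sent in an Authorization header. A sketch using a prepared request so the header is easy to inspect (the token value and repo name are hypothetical placeholders):

```python
from requests import Request

TOKEN = 'your-personal-access-token'  # hypothetical placeholder
url = 'https://api.github.com/repos/owner/private-repo/issues'

# A token sent this way can read private repositories it has access to.
req = Request('GET', url,
              headers={'Authorization': 'token %s' % TOKEN}).prepare()
print(req.headers['Authorization'])
```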

@Billy-

Billy- commented Sep 21, 2016

Hi All. I was running this script (thanks @unbracketed) and found that it would stop writing data at an odd point, because the csv file was never .close()'d. I have fixed that and also made a couple of other changes:

  • It also outputs the response JSON into a file (mainly for debugging, to see what data is available). This feature is not fully working: each request appends a new JSON object to the file, which makes the file invalid JSON. Since it was just for debugging, this wasn't a problem for me
  • It only writes rows which are issues (as opposed to pull requests)
  • It appends a total number of issues on the last row
  • Prints some useful information as it processes the issues

See my fork here

@Kebiled

Kebiled commented Sep 26, 2016

I expanded on @Billy-'s fork by adding @mblackstock's solution to ensure the while loop breaks instead of iterating over the second page forever.

I also added a file called export_multi_repo_issues_to_csv.py, which contains a repository list so you can export issues from multiple repositories into separate CSV files.

Here's my fork.

@ukreddy-erwin

Can anyone suggest how to export issues from GitHub Enterprise to Public GitHub

@Jammizzle

Jammizzle commented Nov 2, 2016

I expanded on @Kebiled's fork by adding support for ZenHub's API, which contributes 'Estimate Value' and 'Pipeline' columns, plus lists of labels and assignees. I'm not sure how many people use ZenHub, but the fork is here if anyone does end up wanting to use it.
Might seem a bit funny making one request per issue for the ZenHub API, that's just the

@patrickfuller

patrickfuller commented Nov 8, 2016

Yet another fork here.

  • Prompts for username/pass instead of raw text
  • Repositories passed as arguments
  • Python3 / PEP8

Usage: python github_to_csv.py --help.

@axd1967

axd1967 commented Dec 22, 2016

(off-topic: this is about gists)

it's striking how many changes are proposed, yet how few actually end up in the associated Git repo... are users here supposed to type those changes in by hand? Maybe Git could help?

I'm a gist n00b, but I can't understand why the comments suggesting code changes aren't accessible as e.g. branches or SHA-1 references (this would require commenters to start by forking, then applying their changes, then sharing them!).

Just to get an idea, I took the trouble to clone this gist and added a handful of forks as remotes.

For example, @davedyk (I just picked a random commenter) proposed changes, but didn't fork...

(if anyone knows of places where Gist issues are discussed, let me know: a bit similar to https://github.com/dear-github/dear-github)

@mfunk

mfunk commented Mar 1, 2017

Thank you Brian! This snippet helped make writing up known issues for release notes sooooo much easier.

@marcelkornblum

marcelkornblum commented Mar 8, 2017

https://gist.github.com/marcelkornblum/21be3c13b2271d1d5a89bf08cbfa500e

Another fork if it's useful to anyone.

The basic functionality is the same, but reorganised into clearer methods. I've added the various snippets people suggested in the early comments, meaning

  • you can use username/pass or token auth
  • set filters for results (including for labels which is more efficient than the original approach)
  • I added labels to the CSV output
  • Pagination is more clearly handled

Tested on python 2.7

Hope this is useful to someone and thanks @unbracketed

@sshaw

sshaw commented Jul 23, 2017

Here's something else (in Ruby) to export pull requests and issues to a CSV file. Supports GitLab and Bitbucket too: https://github.com/sshaw/export-pull-requests

@abevoelker

This script comes up high in Google results for certain queries but it's pretty limited in that it only exports the initial issue, not issue comments.

My goal was to backup GitHub data for an organization, and this project worked a lot better for that purpose: https://github.com/josegonzalez/python-github-backup It also lets you back up issue comments, issue events, PRs, PR review comments, wikis, etc.

@Vravi123

Vravi123 commented Sep 4, 2017

Hi,
I am trying to export ZenHub issues to CSV using the code below:

REPO = ''
url = "https://github.ibm.com/Webtrans/EOSD-ISA-LocalApps/issues/json?issues=%s" % (REPO)

response = requests.get(url, auth=AUTH)
response.json()  # here I am getting the error below:
JSONDecodeError                           Traceback (most recent call last)
in <module>()
----> 1 response.json()

C:\Anaconda3\lib\site-packages\requests\models.py in json(self, **kwargs)
    883                 # used.
    884                 pass
--> 885         return complexjson.loads(self.text, **kwargs)
    886
    887     @property

C:\Anaconda3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    352             parse_int is None and parse_float is None and
    353             parse_constant is None and object_pairs_hook is None and not kw):
--> 354         return _default_decoder.decode(s)
    355     if cls is None:
    356         cls = JSONDecoder

C:\Anaconda3\lib\json\decoder.py in decode(self, s, _w)
    337
    338         """
--> 339         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    340         end = _w(s, end).end()
    341         if end != len(s):

C:\Anaconda3\lib\json\decoder.py in raw_decode(self, s, idx)
    355             obj, end = self.scan_once(s, idx)
    356         except StopIteration as err:
--> 357             raise JSONDecodeError("Expecting value", s, err.value) from None
    358         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Can anyone please help with this?

@jschristie

Hello all,

First, thanks for the code. When I try to run it, I get the error below; could someone please tell me why?

Traceback (most recent call last):
  File "export-issues.py", line 33, in <module>
    write_issues(r)
  File "export-issues.py", line 21, in write_issues
    raise Exception(r.status_code)
Exception: 401

Also, I get the same error with both Python 3.5 and Python 3.6, running python export-issues.py at the command prompt.

Any help on this would be great

@amacfie

amacfie commented Oct 26, 2017

@jschristie usually that means an incorrect password

@simard57

simard57 commented Nov 9, 2017

I am running on a Windows 7 machine.
The import requests (line 7) reports that the module is not found! I just installed Python from python.org -- is there a library I need to get as well?

[update]
I found install instructions for requests at http://docs.python-requests.org/en/master/user/install/#install and installed it. I then ran
python.exe getIssues.py and got:

Traceback (most recent call last):
  File "getIssues.py", line 30, in <module>
    csvout.writerow(('id', 'Title', 'Body', 'Created At', 'Updated At'))
TypeError: a bytes-like object is required, not 'str'

@damithc

damithc commented Dec 2, 2017

Traceback (most recent call last):
  File "getIssues.py", line 30, in <module>
    csvout.writerow(('id', 'Title', 'Body', 'Created At', 'Updated At'))
TypeError: a bytes-like object is required, not 'str'

@simard57 I ran into the same problem. I suspect it is an incompatibility between Python 2 and 3.
Try using this (it worked for me):

csvout = csv.writer(open(csvfile, 'w', newline=''))

instead of this:

csvout = csv.writer(open(csvfile, 'wb'))
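Putting that fix together, a small Python 3 friendly sketch of the file handling (the filename and row values here are made-up examples; the columns mirror the original script):

```python
import csv

# Under Python 3, open in text mode with newline='' and let csv handle
# line endings; strings are written directly, without .encode('utf-8').
with open('some-repo-issues.csv', 'w', newline='') as f:
    csvout = csv.writer(f)
    csvout.writerow(('id', 'Title', 'Body', 'Created At', 'Updated At'))
    csvout.writerow((1, 'Example title', 'Example body',
                     '2017-01-01T00:00:00Z', '2017-01-02T00:00:00Z'))
```

The with block also closes the file, which addresses the missing .close() mentioned earlier in the thread.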

@PatMcCarthy

WRT the script github_to_csv.py, and others here....
So, BLEEDING EDGE NEWBIE here (I can code in everything from COBOL to C#, But today is my first attempt at Python)
Download to windows & Install - smooooth
Copied Python script and ran it.... ummm...
I am getting kicked due to
import requests
ModuleNotFoundError: No module named 'requests'
So..... where can I find this module???
In case it helps: Python 3.6.4 (v3.6.4:d48eceb, Dec 19 2017, 06:04:45) [MSC v.1900 32 bit (Intel)] on win32

@PatMcCarthy

PatMcCarthy commented Jan 2, 2018

Update... Found Requests.
Wish to add it....
Found Install dox...
To install Requests, simply:
.. code-block:: bash
$ pip install requests
@^%^%$!*)@^
Satisfaction guaranteed.

So...

  1. Install Python
  2. Run the following: python -m pip install requests
  3. Run the script, as described above
  4. at this point, YMMV....

@kirankodali

I am getting the below error, please advise what might be the issue

Traceback (most recent call last):
  File "git_issues.py", line 31, in <module>
    write_issues(r)
  File "git_issues.py", line 19, in write_issues
    raise Exception(r.status_code)
Exception: 404

Process finished with exit code 1

@Craigfis

Doesn't work with two-factor auth. I ended up just using curl.
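One workaround short of switching to curl: with two-factor auth enabled, GitHub rejects the account password over basic auth, but a personal access token dropped in as the password generally works. A sketch with placeholder credentials (both values are hypothetical):

```python
from requests import Request

# Placeholder credentials; with 2FA enabled, substitute a personal
# access token for the account password.
AUTH = ('your-username', 'your-personal-access-token')
url = 'https://api.github.com/repos/owner/repo/issues'

# A 2-tuple auth becomes a standard Basic Authorization header.
req = Request('GET', url, auth=AUTH).prepare()
print(req.headers['Authorization'])
```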

@gavinr

gavinr commented Apr 19, 2020

This is a good python script. Thanks for posting it. Here is this concept wrapped in a CLI tool:
https://github.com/gavinr/github-csv-tools
