We don’t have very many external contributors and, by keeping the repository open source, we are costing the company more money.
To test this claim, we’ll need to query the GitHub API:
https://docs.github.com/en/rest
Let’s start by installing the libraries we need to call the API from Python.
pip3 install PyGithub requests
NOTE: we might not need =PyGithub=
To test, let’s just query for some basic information about me.
import requests
from pprint import pprint
username = "zstix"
url = f"https://api.github.com/users/{username}"
data = requests.get(url).json()
print(data["login"])
print(data["name"])
- zstix
- Zack Stickles
Nice, alright let’s see about getting the pull requests from the repository.
import requests
owner = "newrelic"
repo = "newrelic-quickstarts"
url = f"https://api.github.com/repos/{owner}/{repo}/pulls"
data = requests.get(url).json()
users = [pull["user"]["login"] for pull in data]
return users
- bshankararamaya
- brammerl
Groovy, but this only returns the currently open PRs. We want to see all the PRs that were opened over a specific time period, so we’ll need to specify that in our API call.
Also, it’s getting annoying to keep typing stuff over again, so let’s start a session with some helper lines.
import requests
OWNER = "newrelic"
REPO = "newrelic-quickstarts"
BASE_URL = "https://api.github.com"
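Unauthenticated requests to the GitHub REST API are limited to 60 per hour, which a session like this can use up fairly quickly. Here is a minimal sketch of a header helper, assuming a personal access token is available in an environment variable. The variable name `GITHUB_TOKEN` and the helper `build_headers` are hypothetical and not part of the original session.

```python
import os


def build_headers(token=None):
    """Return request headers for the GitHub REST API.

    `token` is optional; without one the request stays unauthenticated.
    """
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


# Assumption: the token, if there is one, lives in GITHUB_TOKEN.
HEADERS = build_headers(os.environ.get("GITHUB_TOKEN"))
```

These headers could then be passed along with each call, for example `requests.get(url, headers=HEADERS)`.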
Okay, so where were we? Oh yeah, let’s get all the closed PRs.
def get_unique_users():
    url = f"{BASE_URL}/repos/{OWNER}/{REPO}/pulls?state=closed&per_page=100"
    data = requests.get(url).json()
    users = list(set([pull["user"]["login"] for pull in data]))
    return users
get_unique_users()
- pkudikyala
- MichelLosier
- RamanaReddy8801
- jospdeleon
- gsidhwani-nr
- DarrenDoyle
- sarahkitten
- dependabot[bot]
- JuliaNocera
- jcountsNR
- rahul188
- zstix
- falcon-tech
- mickeyryan42
- aswanson-nr
- stevula
- d3caf
- relic-js
- brammerl
- sjyothi54
- josephgregoryii
- nr-security-github
- sajosam
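The query string in `get_unique_users` is written by hand, which gets fiddly as more parameters are added. As a rough sketch, the URL could instead be built from keyword arguments. The helper name `pulls_url` is hypothetical, and `BASE_URL` is repeated here only so the snippet runs on its own.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.github.com"


def pulls_url(owner, repo, **params):
    """Build the "list pull requests" URL with an encoded query string."""
    url = f"{BASE_URL}/repos/{owner}/{repo}/pulls"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


print(pulls_url("newrelic", "newrelic-quickstarts", state="closed", per_page=100))
# → https://api.github.com/repos/newrelic/newrelic-quickstarts/pulls?state=closed&per_page=100
```

`requests.get` also accepts a `params` dictionary, which does the same encoding, so this helper is mostly useful when the URL needs to be logged or reused.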
Now we’re getting somewhere! That said, this isn’t time-bound or anything. Let’s get all the PR authors for the last 6 months. Since the API doesn’t seem to offer a “since” option, let’s just keep getting pages until we hit our target date.
Target date: 2023-11-26
from collections import Counter
def get_recent_unique_users():
    url = f"{BASE_URL}/repos/{OWNER}/{REPO}/pulls?state=closed&per_page=100&base=main"
    data = requests.get(url).json()
    users = [p["user"]["login"] for p in data if p["created_at"] > "2023-11-26"]
    return list(Counter(users).items())
get_recent_unique_users()
- (“mickeyryan42” 6)
- (“sarahkitten” 3)
- (“zstix” 2)
- (“brammerl” 7)
- (“d3caf” 7)
- (“aswanson-nr” 4)
- (“MichelLosier” 2)
- (“stevula” 1)
- (“josephgregoryii” 3)
- (“gsidhwani-nr” 1)
- (“caylahamann” 1)
- (“unosios” 1)
- (“sami2ahmed” 1)
- (“jbeveland27” 1)
- (“jcountsNR” 2)
- (“csalvador-nr” 1)
- (“rajut-xrg” 1)
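`get_recent_unique_users` only looks at the first page of 100 results. The "keep getting pages until we hit our target date" idea could be sketched as below. It assumes the endpoint returns PRs newest first (its default ordering by creation date), so paging can stop as soon as a page contains an older PR. The names `count_recent_authors`, `fetch_page`, `max_pages` and the fake data are all hypothetical.

```python
from collections import Counter


def count_recent_authors(fetch_page, target_date, max_pages=50):
    """Count PR authors page by page until PRs older than target_date appear.

    fetch_page(page) must return a list of pull request dicts, newest first,
    or an empty list when there are no more pages.
    """
    counts = Counter()
    for page in range(1, max_pages + 1):
        pulls = fetch_page(page)
        if not pulls:
            break  # ran out of pages
        # ISO 8601 timestamps compare correctly as plain strings.
        recent = [p for p in pulls if p["created_at"] > target_date]
        counts.update(p["user"]["login"] for p in recent)
        if len(recent) < len(pulls):
            break  # this page reached the target date, so stop paging
    return counts


# Hypothetical stand-in for the API: three pages of fake pull requests.
FAKE_PAGES = {
    1: [
        {"user": {"login": "alice"}, "created_at": "2024-03-01T10:00:00Z"},
        {"user": {"login": "bob"}, "created_at": "2024-01-15T09:30:00Z"},
    ],
    2: [
        {"user": {"login": "alice"}, "created_at": "2023-12-02T08:00:00Z"},
        {"user": {"login": "carol"}, "created_at": "2023-10-20T12:00:00Z"},
    ],
    3: [
        {"user": {"login": "dave"}, "created_at": "2023-09-01T12:00:00Z"},
    ],
}

result = count_recent_authors(lambda page: FAKE_PAGES.get(page, []), "2023-11-26")
print(sorted(result.items()))
# → [('alice', 2), ('bob', 1)]
```

Against the real API, `fetch_page` could wrap the existing request and add the endpoint's `page` query parameter, for example `requests.get(f"{url}&page={page}").json()`.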
There are only a few names that I don’t immediately recognize as relics (or our partner XRG). Let’s just manually verify those.
stevula
(1) works for New Relic
unosios
(1) has zero public work besides New Relic
– I would go out on a limb and say this is likely a contractor
sami2ahmed
(1) has some work, not clear if they’re a relic or not
– Seems like he works at Confluent, so this is probably legit
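Part of this manual check might be narrowed down with the `author_association` field that GitHub includes on each pull request object (values such as `MEMBER`, `COLLABORATOR`, `CONTRIBUTOR` or `NONE`). The sketch below is only a first-pass filter under that assumption: it would probably not recognize contractors or partner accounts, so the manual verification above would still be needed. The helper name, the `INTERNAL` set and the sample data are hypothetical.

```python
from collections import defaultdict

# Associations that suggest the author belongs to, or was invited by, the org.
INTERNAL = {"OWNER", "MEMBER", "COLLABORATOR"}


def possibly_external(pulls):
    """Map each login with a non-internal association to its associations."""
    seen = defaultdict(set)
    for pull in pulls:
        association = pull.get("author_association", "NONE")
        if association not in INTERNAL:
            seen[pull["user"]["login"]].add(association)
    return dict(seen)


# Hypothetical sample data shaped like the pulls endpoint's response.
sample = [
    {"user": {"login": "alice"}, "author_association": "MEMBER"},
    {"user": {"login": "bob"}, "author_association": "CONTRIBUTOR"},
    {"user": {"login": "carol"}, "author_association": "NONE"},
]
print(possibly_external(sample))
# → {'bob': {'CONTRIBUTOR'}, 'carol': {'NONE'}}
```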
So, of the 44ish PRs that have come in, only **two** might be from external contributors.
For completeness, let’s dig into the two PRs from contributors that I’m unable to verify, to see what happened with those PRs.
def get_specific_pull(login):
    url = f"{BASE_URL}/repos/{OWNER}/{REPO}/pulls?state=closed&per_page=100&base=main"
    data = requests.get(url).json()
    for pull in data:
        if pull["user"]["login"] == login:
            print(pull["html_url"])
            print(pull["title"])
get_specific_pull("sami2ahmed")
unosios
– newrelic/newrelic-quickstarts#2237
– Title: Add dashboard for lifekeeper 20240112
– Made a PR for a dashboard for a quickstart that didn’t exist
– PR became stale, no one was able to reach the author, closed
– Low quality, would not consider this worth keeping the repo public for
sami2ahmed
– newrelic/newrelic-quickstarts#2169
– Title: fixed dashboard
– LB found some issues with it, asked for changes
– Then AA, then SK, then it got closed
– Another low quality PR in which the author ghosted us and wasted our time
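Whether a closed PR was merged or simply closed can also be read from the API response: pull request objects carry a `merged_at` field that is null when the PR was never merged. Here is a small sketch of that check. The helper names and the sample data are hypothetical, and the sample does not reproduce the real PRs above.

```python
def pull_outcome(pull):
    """Classify a pull request dict as 'merged', 'closed', or 'open'."""
    if pull.get("merged_at"):
        return "merged"
    if pull.get("state") == "closed":
        return "closed"
    return "open"


def outcomes_for(pulls, login):
    """Return (number, title, outcome) for every PR opened by `login`."""
    return [
        (p["number"], p["title"], pull_outcome(p))
        for p in pulls
        if p["user"]["login"] == login
    ]


# Hypothetical sample data shaped like the pulls endpoint's response.
sample = [
    {"number": 1, "title": "Add dashboard", "state": "closed",
     "merged_at": None, "user": {"login": "alice"}},
    {"number": 2, "title": "Fix typo", "state": "closed",
     "merged_at": "2024-02-01T00:00:00Z", "user": {"login": "bob"}},
]
print(outcomes_for(sample, "alice"))
# → [(1, 'Add dashboard', 'closed')]
```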
The repository is only merging in code from relics. In the last six months, there were only two pull requests that might have been from external contributors. In both cases, the PRs were low quality and the original author never engaged with our feedback (ultimately resulting in both getting closed due to inactivity).