@zstix
Last active May 28, 2024 16:44

Hypothesis

We don’t have very many external contributors and, by keeping the repository open source, we are costing the company more money.

Question: Who is contributing to the repository?

To answer this, we’ll need to query the GitHub API:

https://docs.github.com/en/rest

Let’s start by installing the libraries we need to call the API from Python.

pip3 install PyGithub requests

NOTE: we might not need PyGithub

To test, let’s just query for some basic information about me.

import requests

# Look up a single user's public profile.
username = "zstix"
url = f"https://api.github.com/users/{username}"
data = requests.get(url).json()

print(data["login"])
print(data["name"])
  • zstix
  • Zack Stickles
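
One caveat before making more calls: unauthenticated requests to the GitHub API are rate-limited (60 per hour per IP address), so repeated runs can start failing. Below is a minimal sketch of building authenticated headers, assuming a personal access token is available. The `github_headers` helper and the `GITHUB_TOKEN` environment variable name are my own choices for illustration, not part of the session above.

```python
import os


def github_headers(token=None):
    """Build request headers for the GitHub REST API.

    `token` is a hypothetical personal access token; when it is missing,
    the headers fall back to unauthenticated access.
    """
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


# Assumption: the token lives in an environment variable named GITHUB_TOKEN.
HEADERS = github_headers(os.environ.get("GITHUB_TOKEN"))
```

Passing `headers=HEADERS` to `requests.get` should be enough to use this with the calls in this document.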

Nice, alright let’s see about getting the pull requests from the repository.

import requests

owner = "newrelic"
repo = "newrelic-quickstarts"
# With no state parameter, this endpoint returns open PRs only.
url = f"https://api.github.com/repos/{owner}/{repo}/pulls"
data = requests.get(url).json()
users = [pull["user"]["login"] for pull in data]

print(users)
  • bshankararamaya
  • brammerl

Groovy, but these are only the currently open PRs. We want to see all the PRs that have been opened over a specific time period, so we’ll need to specify this in our API call.

Also, it’s getting annoying to keep typing stuff over again, so let’s start a session with some helper lines.

import requests

OWNER = "newrelic"
REPO = "newrelic-quickstarts"
BASE_URL = "https://api.github.com"

Okay, so where were we? Oh yeah, let’s get all the closed PRs.

def get_unique_users():
  # per_page=100 is the API maximum, so this fetches a single page of closed PRs.
  url = f"{BASE_URL}/repos/{OWNER}/{REPO}/pulls?state=closed&per_page=100"
  data = requests.get(url).json()
  users = list({pull["user"]["login"] for pull in data})
  return users

get_unique_users()
  • pkudikyala
  • MichelLosier
  • RamanaReddy8801
  • jospdeleon
  • gsidhwani-nr
  • DarrenDoyle
  • sarahkitten
  • dependabot[bot]
  • JuliaNocera
  • jcountsNR
  • rahul188
  • zstix
  • falcon-tech
  • mickeyryan42
  • aswanson-nr
  • stevula
  • d3caf
  • relic-js
  • brammerl
  • sjyothi54
  • josephgregoryii
  • nr-security-github
  • sajosam
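
The list includes automation accounts such as dependabot[bot]. Here is a rough sketch of filtering those out, assuming each pull's `user` object carries the `type` field that the API sets to "Bot" for app accounts. `human_authors` is a hypothetical helper, and automation that runs under a regular user account would still slip through and need checking by hand.

```python
def human_authors(pulls):
    """Return the sorted unique logins of PR authors that are not bot accounts."""
    return sorted(
        {
            pull["user"]["login"]
            for pull in pulls
            # The API schema allows user to be null, so guard for None.
            if pull.get("user") and pull["user"].get("type") != "Bot"
        }
    )


# Hand-written sample data in the shape of the API response (not real output).
sample = [
    {"user": {"login": "zstix", "type": "User"}},
    {"user": {"login": "dependabot[bot]", "type": "Bot"}},
    {"user": {"login": "zstix", "type": "User"}},
]
print(human_authors(sample))  # ['zstix']
```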

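Now we’re getting somewhere! That said, this isn’t time-bound or anything. Let’s get all the PR authors for the last six months. The pulls endpoint doesn’t seem to offer a “since” option, so the plan is to keep getting pages until we hit our target date. As it turns out, the first page of 100 already reaches back past that date, so a single request is enough here.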

Target date: 2023-11-26

from collections import Counter

def get_recent_unique_users():
  url = f"{BASE_URL}/repos/{OWNER}/{REPO}/pulls?state=closed&per_page=100&base=main"
  data = requests.get(url).json()
  users = [p["user"]["login"] for p in data if p["created_at"] > "2023-11-26"]
  return list(Counter(users).items())

get_recent_unique_users()
  • (“mickeyryan42” 6)
  • (“sarahkitten” 3)
  • (“zstix” 2)
  • (“brammerl” 7)
  • (“d3caf” 7)
  • (“aswanson-nr” 4)
  • (“MichelLosier” 2)
  • (“stevula” 1)
  • (“josephgregoryii” 3)
  • (“gsidhwani-nr” 1)
  • (“caylahamann” 1)
  • (“unosios” 1)
  • (“sami2ahmed” 1)
  • (“jbeveland27” 1)
  • (“jcountsNR” 2)
  • (“csalvador-nr” 1)
  • (“rajut-xrg” 1)
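
For a longer window, one page would not be enough and the "keep getting pages until we hit our target date" idea has to be spelled out. The sketch below assumes the results are requested newest first (via the `sort` and `direction` parameters), so paging can stop at the first PR older than the target date. `fetch_closed_pulls` and `pulls_since` are hypothetical names, and the constants simply repeat the session helpers so the block stands alone.

```python
BASE_URL = "https://api.github.com"
OWNER = "newrelic"
REPO = "newrelic-quickstarts"


def fetch_closed_pulls(page):
    """Fetch one page of closed PRs against main, newest first (network call)."""
    # Imported here so the paging logic below can be exercised without requests.
    import requests

    url = f"{BASE_URL}/repos/{OWNER}/{REPO}/pulls"
    params = {
        "state": "closed",
        "base": "main",
        "sort": "created",
        "direction": "desc",
        "per_page": 100,
        "page": page,
    }
    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


def pulls_since(target_date, fetch_page=fetch_closed_pulls):
    """Collect PRs created after target_date, stopping at the first older one."""
    recent = []
    page = 1
    while True:
        pulls = fetch_page(page)
        if not pulls:
            # Ran out of pages before reaching the target date.
            return recent
        for pull in pulls:
            # ISO 8601 timestamps compare correctly as plain strings.
            if pull["created_at"] <= target_date:
                return recent
            recent.append(pull)
        page += 1
```

Calling `pulls_since("2023-11-26")` and feeding the logins into `Counter` would then give the same kind of tally as above. The `fetch_page` argument exists only so the loop can be tried against canned pages instead of the live API.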

There are only a few names that I don’t immediately recognize as relics (or our partner XRG). Let’s just manually verify those:

  • stevula (1) works for New Relic
  • unosios (1) has zero public work besides New Relic

– I would go out on a limb and say this is likely a contractor

  • sami2ahmed (1) has some work, not clear if they’re a relic or not

– Seems like he works at Confluent, so this is probably legit
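
The manual check above could be narrowed with a first automated pass. This is only a sketch, assuming the `author_association` field on each pull (values such as MEMBER, COLLABORATOR, CONTRIBUTOR, NONE) is a usable signal. It may not reflect private organization membership or contractors, so it would shorten the manual list rather than replace it. `authors_by_association` and the "UNKNOWN" fallback label are hypothetical.

```python
from collections import defaultdict


def authors_by_association(pulls):
    """Group PR author logins by the author_association reported on each pull."""
    groups = defaultdict(set)
    for pull in pulls:
        user = pull.get("user")
        if user:
            # "UNKNOWN" is a made-up label for pulls missing the field.
            groups[pull.get("author_association", "UNKNOWN")].add(user["login"])
    return {association: sorted(logins) for association, logins in groups.items()}


# Hand-written sample data with placeholder logins (not real output).
sample_pulls = [
    {"user": {"login": "alice"}, "author_association": "MEMBER"},
    {"user": {"login": "bob"}, "author_association": "NONE"},
    {"user": {"login": "alice"}, "author_association": "MEMBER"},
]
print(authors_by_association(sample_pulls))
```

Anyone who does not land in the MEMBER or COLLABORATOR groups would still need the kind of profile check done by hand above.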

So, of the 44 PRs that have come in (the sum of the counts above), only **two** might be from external contributors.
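
To double-check the arithmetic, here is a small sketch that totals the counts reported above and works out the share from the two unverified authors. The numbers are copied from the results in this document; the variable names are my own.

```python
# Per-author PR counts, copied from the results above.
counts = {
    "mickeyryan42": 6, "sarahkitten": 3, "zstix": 2, "brammerl": 7,
    "d3caf": 7, "aswanson-nr": 4, "MichelLosier": 2, "stevula": 1,
    "josephgregoryii": 3, "gsidhwani-nr": 1, "caylahamann": 1,
    "unosios": 1, "sami2ahmed": 1, "jbeveland27": 1, "jcountsNR": 2,
    "csalvador-nr": 1, "rajut-xrg": 1,
}

# The two authors that could not be verified as relics.
unverified = ["unosios", "sami2ahmed"]

total = sum(counts.values())
external = sum(counts[login] for login in unverified)
print(f"{external} of {total} PRs ({external / total:.1%})")  # 2 of 44 PRs (4.5%)
```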

For completeness, let’s dig into the two PRs from contributors that I’m unable to verify, to see what happened with those PRs.

def get_specific_pull(login):
  url = f"{BASE_URL}/repos/{OWNER}/{REPO}/pulls?state=closed&per_page=100&base=main"
  data = requests.get(url).json()
  for pull in data:
    if pull["user"]["login"] == login:
      print(pull["html_url"])
      print(pull["title"])

get_specific_pull("unosios")
get_specific_pull("sami2ahmed")
  • unosios

newrelic/newrelic-quickstarts#2237
– Title: Add dashboard for lifekeeper 20240112
– Made a PR for a dashboard for a quickstart that didn’t exist
– PR became stale, no one was able to reach the author, closed
– Low quality, would not consider this worth keeping the repo public for

  • sami2ahmed

newrelic/newrelic-quickstarts#2169
– Title: fixed dashboard
– LB found some issues with it, asked for changes
– Then AA, then SK, then it got closed
– Another low quality PR in which the author ghosted us and wasted our time
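
Whether a closed PR was actually merged can also be read from the API rather than from the PR page. The sketch below assumes the `merged_at` field in the pull list response is null for PRs that were closed without merging. `pull_outcomes` is a hypothetical helper, and the sample rows are transcribed from the notes above, with `merged_at` assumed to be null because both PRs were closed for inactivity.

```python
def pull_outcomes(pulls, login):
    """Return (number, title, outcome) for each closed PR opened by `login`."""
    outcomes = []
    for pull in pulls:
        user = pull.get("user")
        if user and user["login"] == login:
            # merged_at is null when a PR was closed without being merged.
            outcome = "merged" if pull.get("merged_at") else "closed unmerged"
            outcomes.append((pull["number"], pull["title"], outcome))
    return outcomes


# Sample rows transcribed from the notes above (not fetched from the API).
closed_pulls = [
    {"number": 2237, "title": "Add dashboard for lifekeeper 20240112",
     "user": {"login": "unosios"}, "merged_at": None},
    {"number": 2169, "title": "fixed dashboard",
     "user": {"login": "sami2ahmed"}, "merged_at": None},
]
print(pull_outcomes(closed_pulls, "sami2ahmed"))
```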

Answer

The repository is only merging in code from relics. In the last six months, there were only two pull requests that might have been from external contributors. In both cases, the PRs were low quality and the original author never responded to our feedback (ultimately resulting in both getting closed due to inactivity).
