Jenny Sahng jennynz

@jennynz
jennynz / review_wait_time_percent_per_year.py
Last active August 24, 2022 01:20
Datavis for how long contributors wait for a review on open source projects
# Given a dataframe called `df` which you've wrangled to have the columns:
# - org (str)
# - year (int or float)
# - bin (str) (e.g. Under 1 day, 1 day to 1 week)
# - percent (float)
# Here's the matplotlib code to generate the datavis in this blog post:
# https://levelup.gitconnected.com/how-does-pr-review-wait-time-affect-your-open-source-project-d79bd0af0ea3
import pandas as pd
import matplotlib.pyplot as plt
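The preview above stops at the imports, so the chart code itself is not visible here. As a rough sketch only, assuming the chart is a stacked bar per year (the source does not say which chart type it is), rows in the described `org` / `year` / `bin` / `percent` shape could be laid out like this. The helper names `stacked_segments` and `plot_org`, and the axis label, are hypothetical.

```python
from collections import defaultdict


def stacked_segments(rows, bin_order):
    """Turn long-format (year, bin, percent) rows into stacked-bar segments.

    Returns {bin: (years, heights, bottoms)}, where each bottom is the running
    total of the bins drawn before it for that year.
    """
    percent = defaultdict(float)
    for row in rows:
        percent[(row["year"], row["bin"])] += row["percent"]
    years = sorted({row["year"] for row in rows})
    running = {year: 0.0 for year in years}
    segments = {}
    for bin_name in bin_order:
        heights = [percent[(year, bin_name)] for year in years]
        bottoms = [running[year] for year in years]
        for year, height in zip(years, heights):
            running[year] += height
        segments[bin_name] = (years, heights, bottoms)
    return segments


def plot_org(rows, bin_order, title):
    """Draw one stacked bar per year (assumed chart type, not the gist's own code)."""
    import matplotlib.pyplot as plt  # imported lazily so stacked_segments has no dependencies

    fig, ax = plt.subplots()
    for bin_name, (years, heights, bottoms) in stacked_segments(rows, bin_order).items():
        ax.bar(years, heights, bottom=bottoms, label=bin_name)
    ax.set_xlabel("Year")
    ax.set_ylabel("Percent of PRs")  # assumed meaning of the `percent` column
    ax.set_title(title)
    ax.legend()
    return fig
```

With a dataframe in the shape described, the rows for one organisation could be obtained with something like `df[df["org"] == "vuejs"].to_dict("records")` and passed to `plot_org` along with the bin labels in the order they should stack.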
jennynz / github_graphql_etl_script.py
Created July 6, 2022 04:51
Python script for getting GitHub data (e.g. PRs, comments, reviews) about a public repository via their GraphQL API, with pagination & rate limit handling
import requests
import boto3
import json
import os
from datetime import datetime, timedelta
# Example: getting data for vuejs/vue from the last 90 days
ORG_NAME = "vuejs"
REPO_NAME = "vue"
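The description mentions pagination and rate limit handling, but the preview ends before that logic. The following is a minimal sketch of how cursor-based pagination against GitHub's GraphQL API might look, not the gist's actual code. The query text, the `run_query` callable, and the function names are assumptions made for illustration; only `pageInfo { hasNextPage endCursor }` and `rateLimit { remaining resetAt }` are taken from GitHub's published schema.

```python
import time
from datetime import datetime, timezone

# Illustrative query: the selected fields are an assumption, not the gist's query.
PR_QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  rateLimit { remaining resetAt }
  repository(owner: $owner, name: $name) {
    pullRequests(first: 100, after: $cursor,
                 orderBy: {field: CREATED_AT, direction: DESC}) {
      pageInfo { hasNextPage endCursor }
      nodes { number createdAt }
    }
  }
}
"""


def seconds_until(reset_at, now=None):
    """Seconds from `now` until a UTC timestamp such as 2022-07-06T05:00:00Z."""
    reset = datetime.strptime(reset_at, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (reset - now).total_seconds())


def fetch_all_pull_requests(run_query, owner, name, min_remaining=1, sleep=time.sleep):
    """Collect every pull request node, following cursors page by page.

    `run_query(query, variables)` is a caller-supplied (hypothetical) function
    that sends the request and returns the decoded "data" object.
    """
    nodes, cursor = [], None
    while True:
        data = run_query(PR_QUERY, {"owner": owner, "name": name, "cursor": cursor})
        connection = data["repository"]["pullRequests"]
        nodes.extend(connection["nodes"])
        if not connection["pageInfo"]["hasNextPage"]:
            return nodes
        cursor = connection["pageInfo"]["endCursor"]
        # Pause until the quota resets if the next request would exceed it.
        limit = data["rateLimit"]
        if limit["remaining"] < min_remaining:
            sleep(seconds_until(limit["resetAt"]))
```

Passing the transport in as `run_query` keeps the loop testable without network access. In practice it would likely wrap a `requests.post` call to `https://api.github.com/graphql` with an access token, and a cut-off such as the 90-day window mentioned above could be applied by stopping once `createdAt` falls outside it.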
jennynz / gharchive_event_types.sql
Created July 5, 2022 05:25
Event types available on GHArchive
CommitCommentEvent
CreateEvent
DeleteEvent
ForkEvent
GollumEvent
IssueCommentEvent
IssuesEvent
MemberEvent
PublicEvent
PullRequestEvent
jennynz / bots.py
Created June 22, 2022 02:05
Filtering out bots from GitHub data
# List of common bots on GitHub
# Doesn't include ones that would already be filtered out by the is_bot function
# but it won't hurt to also include them in here
GITHUB_BOTS = [
'netlify',
'linear-app',
'codeclimate',
'renovate',
'renovate-approve',
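The comments above refer to an `is_bot` function that the preview does not show. As a sketch under that caveat, the two checks could be combined as follows. The body of `is_bot` here is an assumed heuristic (GitHub App accounts have logins ending in `[bot]`), and `is_human` and `drop_bot_rows` are hypothetical helper names.

```python
# The five names visible in the truncated preview above.
GITHUB_BOTS = ["netlify", "linear-app", "codeclimate", "renovate", "renovate-approve"]


def is_bot(login):
    """Assumed heuristic: GitHub App logins end in "[bot]", e.g. dependabot[bot]."""
    return login.lower().endswith("[bot]")


def is_human(login):
    """True when the login matches neither the suffix heuristic nor the named list."""
    return not is_bot(login) and login.lower() not in GITHUB_BOTS


def drop_bot_rows(rows, login_key="login"):
    """Keep only the rows whose login does not look like a bot account."""
    return [row for row in rows if is_human(row[login_key])]
```

Keeping the explicit list alongside the suffix check catches bot accounts whose logins carry no `[bot]` marker.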
jennynz / gharchive_bq_example.sql
Created June 22, 2022 01:56
Query for getting PR and review-related fields from GHArchive on BigQuery
SELECT
repo.name as repo,
type,
created_at,
actor.login,
JSON_VALUE(payload, '$.action') as action,
-- *** PR columns
JSON_VALUE(payload, '$.pull_request.node_id') as pr_node_id,
JSON_VALUE(payload, '$.pull_request.state') as pr_state,
JSON_VALUE(payload, '$.pull_request.user.login') as pr_user_login,
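To make the `JSON_VALUE` path lookups above concrete, here is a rough Python analogue for simple dotted paths. It is only an approximation of BigQuery's behaviour for the kind of path used in this query (scalars come back as strings, anything else as `None`), and `json_value` is a hypothetical helper name.

```python
import json


def json_value(payload, path):
    """Approximate JSON_VALUE for dotted paths such as $.pull_request.state.

    Returns the scalar at `path` as a string, or None when a key is missing
    or the value is an object, an array, or null.
    """
    node = json.loads(payload)
    keys = path[2:] if path.startswith("$.") else path
    for key in keys.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    if node is None or isinstance(node, (dict, list)):
        return None
    # Strings pass through unchanged; numbers and booleans use their JSON spelling.
    return node if isinstance(node, str) else json.dumps(node)
```

For a `PullRequestEvent` payload, `json_value(payload, "$.pull_request.user.login")` would then correspond to the `pr_user_login` column selected above.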
jennynz / harry_potter_all_characters.csv
Last active February 9, 2022 03:23
All 689 Harry Potter characters
Aberforth Dumbledore
Abernathy
Abraham Peasegood
Abraham Potter
Abraxas Malfoy
Achilles Tolliver
Adalbert Waffling
Adrian Pucey
Adrian Tutley
Agatha Chubb
jennynz / harry_potter_locations.csv
Created February 8, 2022 03:14
List of Harry Potter locations (includes Diagon Alley, Hogsmeade, and others)
2nd Hand Brooms
Amanuensis Quills
Apothecary
Bluebottle
Broomstix
Cleansweep Broom Company
Comet Brooms Trading Company
De Vine and Daughters Crystal Balls
Dervish and Banges
Diagon Alley Apothecary
jennynz / harry_potter_characters_top_100.csv
Created February 8, 2022 02:59
List of top 100 Harry Potter characters by mention
Harry Potter
Ron Weasley
Hermione Granger
Albus Dumbledore
Rubeus Hagrid
Severus Snape
Voldemort
Sirius Black
Draco Malfoy
Fred Weasley
jennynz / .zshrc
Last active November 26, 2019 22:13
Powerlevel9K zsh terminal config with pygmentize and git aliases
# If you come from bash you might have to change your $PATH.
# export PATH=$HOME/bin:/usr/local/bin:$PATH
# Path to your oh-my-zsh installation.
export ZSH="/Users/jenny.sahng/.oh-my-zsh"
# Set name of the theme to load --- if set to "random", it will
# load a random theme each time oh-my-zsh is loaded, in which case,
# to know which specific one was loaded, run: echo $RANDOM_THEME
# See https://github.com/robbyrussell/oh-my-zsh/wiki/Themes
jennynz / pretty_prompt.sh
Created July 8, 2019 21:38
A nicer bash prompt with working directory in blue and git branch in yellow.
PS1="\[\033[36m\]\w\[\033[m\]\[\033[32m\] \[\033[33;1m\](\$(git branch 2>/dev/null | grep '^*' | cut -c 3-))\[\033[m\] \$ "