David Yerrington dyerrington

dyerrington / m1_fix_5.1.2.event_loops_py.patch
Last active December 12, 2020 18:50
Patch to fix Jupyter restarting on Apple M1 processors.
--- eventloops.py-orig 2020-12-12 10:22:08.633178000 -0800
+++ eventloops.py 2020-12-12 10:22:56.335735100 -0800
@@ -20,7 +20,7 @@
Checks if we are on OS X 10.9 or greater.
"""
- return sys.platform == 'darwin' and V(platform.mac_ver()[0]) >= V('10.9')
+ return sys.platform == 'darwin' and V(platform.mac_ver()[0]) >= V('10.9') and platform.mac_ver()[2] != 'arm64'
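The patched condition can be sketched as a standalone function. This is a minimal sketch, not the patched file itself: the name `passes_patched_check` and its override parameters are hypothetical and exist only so the logic can be tried on any machine, and a plain tuple comparison stands in for the `V(...)` version helper used in the diff.

```python
import platform
import sys


def _version_tuple(version):
    """Turn '10.15.7' into (10, 15, 7); non-numeric parts are skipped."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def passes_patched_check(sys_platform=None, mac_version=None, machine=None):
    """Hypothetical stand-in for the patched check in eventloops.py.

    True only on macOS 10.9 or newer when the machine is not arm64.
    The keyword arguments are assumptions added for illustration; when
    left as None, the values are read from the running system.
    """
    release, _, detected_machine = platform.mac_ver()
    sys_platform = sys.platform if sys_platform is None else sys_platform
    mac_version = release if mac_version is None else mac_version
    machine = detected_machine if machine is None else machine
    return (
        sys_platform == "darwin"
        and _version_tuple(mac_version) >= (10, 9)
        and machine != "arm64"
    )
```

Calling `passes_patched_check()` with no arguments evaluates the condition for the current machine.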
dyerrington / multi-roc_curve.py
Last active September 5, 2020 00:48
I can't tell you how many times I've plotted a ROC curve for a multi-class problem from scratch. Too many times. I made this gist to demonstrate how to implement a multi-class ROC (Receiver Operating Characteristic) plot in the simplest manner possible using Python.
## import any sklearn models and collect predictions / probabilities beforehand
import matplotlib.pyplot as plt
from cycler import cycler
## Line color config -- rather than create a structure with a finite color palette, use your own to cycle through a list.
default_cycler = (cycler(color=['r', 'g', 'b', 'y']) +
                  cycler(linestyle=['-', '--', ':', '-.']))
plt.rc('axes', prop_cycle = default_cycler)
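The preview stops before any curve is computed. As a rough illustration of the one-vs-rest idea behind a multi-class ROC plot, here is a pure-Python sketch; it is not the gist's own code. The function names `roc_points` and `multiclass_roc` are hypothetical, `probabilities` is assumed to be one row of per-class scores per sample, and each class is assumed to have at least one positive and one negative sample.

```python
def roc_points(y_true, scores, positive):
    """Return (fpr, tpr) lists for one class treated as positive (one-vs-rest)."""
    labels = [1 if label == positive else 0 for label in y_true]
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Walk the samples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i]:
            tp += 1
        else:
            fp += 1
        is_last = rank == len(order) - 1
        # Emit a point only when the score changes, so ties share a threshold.
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def multiclass_roc(y_true, probabilities, classes):
    """Map each class to its one-vs-rest (fpr, tpr) curve."""
    return {
        cls: roc_points(y_true, [row[k] for row in probabilities], cls)
        for k, cls in enumerate(classes)
    }
```

Each `(fpr, tpr)` pair can then be drawn with `plt.plot(fpr, tpr, label=str(cls))`, so every class picks up the next color and line style from the cycler configured above.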
dyerrington / gist:6850e459c37b3f01e4049505c1634256
Created July 13, 2020 16:42
Debugged fetch_states for Cesar
def fetch_states():
    """
    Return all <option> values for the <select> element with all states.
    """
    data = {}
    states = tree.xpath('//select[@name="state"]')
    try:
        for state in states[0].xpath('option'):
            data[state.attrib['value']] = state.text_content()
    except:
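The same lookup can be tried without a live page. The sketch below swaps in the standard library's `xml.etree.ElementTree` for the `tree.xpath(...)` calls above, so it only works on well-formed markup. The function name `fetch_states_from_markup` and the sample form are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup standing in for the real page.
SAMPLE_MARKUP = """
<form>
  <select name="state">
    <option value="CA">California</option>
    <option value="NY">New York</option>
  </select>
</form>
"""


def fetch_states_from_markup(markup):
    """Return {option value: option text} for the <select name="state"> element."""
    root = ET.fromstring(markup)
    data = {}
    for option in root.findall(".//select[@name='state']/option"):
        value = option.get("value")
        if value is not None:  # skip options without a value attribute
            data[value] = (option.text or "").strip()
    return data
```

If no matching `<select>` exists, the loop never runs and an empty dictionary comes back, which avoids the need for a bare `except`.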
import requests, re
def test_station_data_availability(station_id):
    for year in range(1960, 2020 + 1):
        r = requests.get(f"https://www.ncei.noaa.gov/data/local-climatological-data/access/{year}/")
        # Escape the station ID and the dot so both are matched literally.
        matches = re.search(r"href=\"([0-9]{6}" + re.escape(str(station_id)) + r"\.csv)", r.text)
        if matches:
            print(station_id, " data exists for ", year)
        else:
            print(station_id, " data not found for ", year)
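The pattern can be checked offline against a fragment of a directory listing. This is a small sketch under assumptions: the helper name `find_station_file` and the sample listing are hypothetical, and the six leading digits are taken from the pattern above rather than from any NOAA documentation.

```python
import re


def find_station_file(listing_html, station_id):
    """Return the first '<6 digits><station_id>.csv' file name linked in the listing, or None."""
    # re.escape keeps the station ID literal; \. matches a real dot.
    pattern = rf'href="([0-9]{{6}}{re.escape(str(station_id))}\.csv)"'
    match = re.search(pattern, listing_html)
    return match.group(1) if match else None


# Hypothetical listing fragment for illustration only.
sample_listing = '<a href="72278023183.csv">72278023183.csv</a>'
found = find_station_file(sample_listing, 23183)
missing = find_station_file(sample_listing, 99999)
```

Separating the match from the HTTP request makes it possible to test the pattern without sending 61 requests per station.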
import pandas as pd
import numpy as np
from sklearn.datasets import load_wine
# Load example wine dataset from sklearn
data = load_wine()
# Create a basic DataFrame
df = pd.DataFrame(data['data'], columns = data['feature_names'])
import pandas as pd
df = pd.read_clipboard()
dyerrington / basic_ols_numpy_example.py
Last active February 7, 2020 19:50
Ordinary least squares implemented with numpy.
import numpy as np
import sys
# lines = input.split("\n")
lines = sys.stdin.readlines()
train_header = lines[0].split()
n_train_features, n_train_observations = int(train_header[0]), int(train_header[1])
# Rows 1..n_train_observations hold the training data, so the slice end is n + 1.
training = np.array([row.split() for row in lines[1:n_train_observations + 1]], dtype = float)
X_train, y_train = training[:, 0:n_train_features], training[:, n_train_features:]
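The preview ends before the fit itself. One way to finish an ordinary least squares fit with numpy is sketched below. It is an assumption about the approach, not the gist's actual code: the helper names `fit_ols` and `predict` are hypothetical, an intercept column is added by choice, and `np.linalg.lstsq` is used to solve the least squares problem.

```python
import numpy as np


def fit_ols(X, y):
    """Return OLS coefficients with the intercept first, then one weight per feature."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    X_design = np.column_stack([np.ones(len(X)), X])
    coefficients, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return coefficients


def predict(coefficients, X):
    """Apply fitted coefficients to new rows of features."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(len(X)), X]) @ coefficients


# Tiny illustrative data set generated from y = 1 + 2x.
example_coefficients = fit_ols([[0], [1], [2], [3]], [1, 3, 5, 7])
```

With the arrays from the snippet above, `fit_ols(X_train, y_train)` would return one column of coefficients, because `y_train` is sliced as a two-dimensional array.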
dyerrington / pearson.py
Created February 6, 2020 23:41
Pearson correlation coefficient coded from scratch.
import pandas as pd
import math
lines = [line.split() for line in input.split("\n") if len(line)]
X, y = [[int(score) for score in scores] for index, (variable, _, *scores) in enumerate(lines)]
n = len(X)
sum_X, sum_y = sum(X), sum(y)
sum_Xy = sum([X[i] * y[i] for i in range(len(X))])
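The sums above are the building blocks of the computational form of Pearson's r: r = (n·ΣXy − ΣX·Σy) / sqrt((n·ΣX² − (ΣX)²)(n·Σy² − (Σy)²)). A minimal sketch of the remaining steps follows. The function name `pearson_r` is hypothetical, and both inputs are assumed to be equal-length sequences with non-zero variance.

```python
import math


def pearson_r(X, y):
    """Pearson correlation coefficient from raw sums (computational formula)."""
    n = len(X)
    sum_X, sum_y = sum(X), sum(y)
    sum_Xy = sum(a * b for a, b in zip(X, y))
    sum_X2 = sum(a * a for a in X)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_Xy - sum_X * sum_y
    # A constant X or y makes the denominator zero and raises ZeroDivisionError.
    denominator = math.sqrt((n * sum_X2 - sum_X ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator
```

Perfectly linear inputs such as `pearson_r([1, 2, 3], [2, 4, 6])` should come out at 1, up to floating point rounding.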