Joel Nothman jnothman

  • Canva
  • Sydney
@jnothman
jnothman / get-pr-modified-lines.sh
Created June 21, 2018 11:39
Identify which lines in master are modified by open pull requests
#!/bin/bash
# requires curl, jq, python
# set these variables
repo=scikit-learn/scikit-learn
remote=upstream
token= # GitHub personal access token goes here
list_open_prs() {
@jnothman
jnothman / memoryfit.py
Created March 6, 2018 23:00
Scikit-learn: Cache an estimator's fit with a mixin
import joblib  # sklearn.externals.joblib was deprecated in scikit-learn 0.21 and removed in 0.23
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFE
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import make_classification
memory = joblib.Memory('/tmp')
   elapsed             n_features  n_samples_Y  working_memory
0  37.13675403594971   1000        100          1
1  36.132298946380615  1000        100          1
2  38.42840266227722   1000        100          1
3  37.29706525802612   1000        1000         1
4  34.58624744415283   1000        1000         1
5  36.96818971633911   1000        1000         1
6  38.077014684677124  1000        5000         1
7  36.49429655075073   1000        5000         1
8  39.22079372406006   1000        5000         1

   elapsed             n_features  n_samples    working_memory
0  259.7408814430237   100         40000        1
1  263.6345293521881   100         40000        1
2  263.4755172729492   100         40000        1
3  32.214531898498535  100         20000        1
4  32.38340449333191   100         20000        1
5  32.6368203163147    100         20000        1
6  4.7826080322265625  100         10000        1
7  4.722550392150879   100         10000        1
8  4.735676050186157   100         10000        1
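The repeated runs in the second table can be summarised by averaging `elapsed` per `n_samples` value. The sketch below is only an illustration: the rows are copied from the table, the helper name `mean_elapsed_by` is hypothetical, and the unit of `elapsed` is not stated in the source.

```python
from collections import defaultdict
from statistics import mean

# Rows copied from the second table:
# (elapsed, n_features, n_samples, working_memory)
rows = [
    (259.7408814430237, 100, 40000, 1),
    (263.6345293521881, 100, 40000, 1),
    (263.4755172729492, 100, 40000, 1),
    (32.214531898498535, 100, 20000, 1),
    (32.38340449333191, 100, 20000, 1),
    (32.6368203163147, 100, 20000, 1),
    (4.7826080322265625, 100, 10000, 1),
    (4.722550392150879, 100, 10000, 1),
    (4.735676050186157, 100, 10000, 1),
]


def mean_elapsed_by(rows, column):
    """Group rows by the value in `column` and average the elapsed time."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row[0])
    return {key: mean(values) for key, values in groups.items()}


means = mean_elapsed_by(rows, column=2)  # column 2 is n_samples
for n_samples in sorted(means):
    print(n_samples, round(means[n_samples], 2))
```

In these runs, each doubling of `n_samples` increases the mean elapsed time by considerably more than a factor of two.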
@jnothman
jnothman / plotview.py
Last active December 6, 2017 22:07
Generic Django plot view for matplotlib plot rendering and serving
from django.views import View
from django.http import HttpResponse
MIMETYPES = {
    'ps': 'application/postscript',
    'eps': 'application/postscript',
    'pdf': 'application/pdf',
    'svg': 'image/svg+xml',
    'png': 'image/png',
    'jpeg': 'image/jpeg',
@jnothman
jnothman / auspoliticians-wikidata.rq
Created October 23, 2017 06:05
Get Australian/COAG parliamentarians' Positions, Party Memberships and Awards from WikiData.
SELECT
  ?subj
  ?subjLabel
  ?prop
  ?position
  ?positionLabel
  ?start
  ?end
  ?district
  ?districtLabel
@jnothman
jnothman / deprecdict.py
Last active October 10, 2017 05:31
A dict which raises a warning when some keys are looked up
import warnings
from sklearn.utils.testing import assert_warns_message, assert_no_warnings


class DeprecationDict(dict):
    """A dict which raises a warning when some keys are looked up

    Note, this does not raise a warning for __contains__ and iteration.
    It also will raise a warning even after the key has been manually set by
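
The preview above is cut off inside the docstring, so the rest of the class is not shown. As a minimal sketch of the behaviour described so far (a warning on lookup, none for `__contains__` or iteration, and a warning that persists after the key is reassigned), something like the following would work. The class name `WarnOnLookupDict` and the method `mark_deprecated` are hypothetical and are not taken from the gist.

```python
import warnings


class WarnOnLookupDict(dict):
    """Hypothetical sketch: a dict that warns when selected keys are looked up."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._deprecated = {}  # key -> (message, warning category)

    def mark_deprecated(self, key, message, category=DeprecationWarning):
        """Register a warning to issue whenever `key` is looked up."""
        self._deprecated[key] = (message, category)

    def _maybe_warn(self, key):
        if key in self._deprecated:
            message, category = self._deprecated[key]
            # stacklevel=3 points the warning at the code doing the lookup
            warnings.warn(message, category, stacklevel=3)

    def __getitem__(self, key):
        self._maybe_warn(key)
        return super().__getitem__(key)

    def get(self, key, default=None):
        # dict.get does not go through __getitem__, so it is overridden too
        self._maybe_warn(key)
        return super().get(key, default)


params = WarnOnLookupDict(old_name=1, new_name=2)
params.mark_deprecated('old_name', "'old_name' is deprecated; use 'new_name'")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    params['old_name']        # warns
    'old_name' in params      # no warning: __contains__ is untouched
    list(params)              # no warning: iteration is untouched
    params['new_name']        # no warning: key was never marked
    print(len(caught))
```

Because `__setitem__` is left alone, assigning to a marked key does not clear its warning, which matches the note in the docstring.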
@jnothman
jnothman / check_multilabel_output_shapes.py
Created August 23, 2017 03:07
Multilabel decision_function and predict_proba output shapes
import warnings
import sklearn
warnings.simplefilter('ignore')
from sklearn import *
X, y = datasets.make_multilabel_classification()
for clf in [tree.DecisionTreeClassifier(),
            neighbors.KNeighborsClassifier(),
            neural_network.MLPClassifier(),
            multioutput.MultiOutputClassifier(linear_model.LogisticRegression()),
@jnothman
jnothman / pandasvectorizer.py
Last active August 21, 2017 03:00
Vectorize a pandas DataFrame with scikit-learn <= 0.19
from sklearn.feature_extraction import DictVectorizer
class PandasVectorizer(DictVectorizer):
    def fit(self, x, y=None):
        return super(PandasVectorizer, self).fit(x.to_dict('records'))

    def fit_transform(self, x, y=None):
        return super(PandasVectorizer, self).fit_transform(x.to_dict('records'))

    def transform(self, x):
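
The preview stops at the `transform` signature. A self-contained sketch of the same idea follows, assuming `transform` mirrors `fit` and `fit_transform` by converting the DataFrame to one dict per row. The `transform` body, the `_records` helper and the example data are assumptions, not the gist's code.

```python
import pandas as pd
from sklearn.feature_extraction import DictVectorizer


class PandasVectorizer(DictVectorizer):
    """DictVectorizer that accepts a DataFrame by converting rows to dicts."""

    @staticmethod
    def _records(frame):
        # One dict per row, e.g. {'city': 'Dubai', 'temp': 33.0}
        return frame.to_dict(orient='records')

    def fit(self, x, y=None):
        return super().fit(self._records(x))

    def fit_transform(self, x, y=None):
        return super().fit_transform(self._records(x))

    def transform(self, x):
        # Assumed body: the original preview is truncated at this method.
        return super().transform(self._records(x))


frame = pd.DataFrame({
    'city': ['Dubai', 'London', 'Dubai'],
    'temp': [33.0, 12.0, 35.0],
})

vec = PandasVectorizer(sparse=False)  # dense output for easy inspection
matrix = vec.fit_transform(frame)
print(vec.feature_names_)
print(matrix)
```

String columns are one-hot encoded as `column=value` features and numeric columns pass through unchanged, which is standard `DictVectorizer` behaviour.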