Walter Reade walterreade

@walterreade
walterreade / xgb_hyperopt.py
Created January 16, 2016 23:02
XGBoost Hyperopt Gridsearch
# http://www.dataiku.com/blog/2015/08/24/xgboost_and_dss.html
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from sklearn.metrics import roc_auc_score
import xgboost as xgb
from hyperopt import hp, fmin, tpe, STATUS_OK, Trials
train = dataiku.Dataset("train").get_dataframe()
@walterreade
walterreade / sparse_dummies.py
Last active January 16, 2016 22:56
Create Sparse Dummies
# http://www.dataiku.com/blog/2015/08/24/xgboost_and_dss.html
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix


def sparse_dummies(categorical_values):
    # pd.Categorical(...) replaces the removed Categorical.from_array(...)
    categories = pd.Categorical(categorical_values)
    N = len(categorical_values)
    # np.int was removed from NumPy; the builtin int behaves the same here
    row_numbers = np.arange(N, dtype=int)
    ones = np.ones((N,))
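The preview above stops before the matrix is built. One way the remaining steps could look is sketched below; it is an assumption about the approach, not the rest of the original gist. The name `sparse_dummies_sketch` and the sample colour values are hypothetical. It uses the category codes as column indices and leaves rows with missing values empty.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix


def sparse_dummies_sketch(categorical_values):
    """One-hot encode a 1-D sequence into a sparse CSR matrix (hypothetical helper)."""
    categories = pd.Categorical(categorical_values)
    # Integer code of each value; missing values are coded as -1
    codes = categories.codes.astype(np.int64)
    n_rows = len(codes)
    n_cols = len(categories.categories)
    # Keep only rows with a real category so missing values stay all-zero
    present = codes >= 0
    row_numbers = np.arange(n_rows)[present]
    ones = np.ones(int(present.sum()))
    return csr_matrix((ones, (row_numbers, codes[present])), shape=(n_rows, n_cols))


# Categories are sorted, so the columns are blue, green, red
dummies = sparse_dummies_sketch(["red", "green", "red", "blue"])
```

Each row holds at most one stored value, so memory grows with the number of rows rather than with rows times categories.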
@walterreade
walterreade / xgb_aws.txt
Created September 8, 2015 15:31
XGBoost on AWS
sudo apt-get update
sudo apt-get install make
sudo apt-get install gcc
sudo apt-get install g++
sudo apt-get install git
git clone https://github.com/dmlc/xgboost
cd xgboost
./build.sh
cd python-package
python setup.py install
import matplotlib.pyplot as plt
plt.style.use('bmh')
colors = ['#348ABD', '#A60628', '#7A68A6', '#467821', '#D55E00',
          '#CC79A7', '#56B4E9', '#009E73', '#F0E442', '#0072B2']
# https://github.com/rasbt/matplotlib-gallery
@walterreade
walterreade / kaggle_lb_score.py
Last active August 29, 2015 14:21
Show the progression of a Kaggle leader board distribution over time
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
scores = pd.read_csv('liberty-mutual-group-property-inspection-prediction_public_leaderboard.csv')
# Keep only the calendar date of each submission, stored as a 'YYYY-MM-DD' string
scores['SubmissionDate'] = pd.to_datetime(scores['SubmissionDate']).dt.strftime('%Y-%m-%d')
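The preview ends before any aggregation. As a rough sketch of how the leaderboard could be tracked over time, the block below computes each team's best score as of each date. The column names `TeamName` and `Score`, the four sample rows, and the assumption that a higher score is better are all hypothetical; they are not taken from the gist or the competition file.

```python
import pandas as pd

# Hypothetical rows standing in for the public leaderboard CSV
scores = pd.DataFrame({
    "TeamName": ["a", "b", "a", "b"],
    "SubmissionDate": ["2015-07-01 10:00:00", "2015-07-01 12:00:00",
                       "2015-07-02 09:00:00", "2015-07-03 08:00:00"],
    "Score": [0.30, 0.35, 0.36, 0.34],
})
scores["SubmissionDate"] = pd.to_datetime(scores["SubmissionDate"]).dt.strftime("%Y-%m-%d")

# Best score each team submitted on each date (NaN where a team was inactive)
daily_best = scores.pivot_table(index="SubmissionDate", columns="TeamName",
                                values="Score", aggfunc="max")

# Running best per team, carried forward across days without a submission
running_best = daily_best.cummax().ffill()
```

Each row of `running_best` is then the leaderboard as it stood on that date, so plotting one distribution per row shows how the scores progress.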
@walterreade
walterreade / virtualenv.txt
Last active October 9, 2022 06:41
Setting up standard virtualenv
### for brand-new only
sudo apt-get update
sudo apt-get install htop
sudo apt-get install build-essential
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
rm Miniconda3-latest-Linux-x86_64.sh
@walterreade
walterreade / xgb_fscore
Created April 19, 2015 13:43
XGBoost Variable Importance
# dict.iteritems() exists only in Python 2; items() works in Python 3
fscore = [(v, k) for k, v in clf.get_fscore().items()]
fscore.sort(reverse=True)
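To illustrate the ranking step without a trained model, here is a small sketch that sorts a plain dict of the same shape `get_fscore()` returns (feature name mapped to a count). The feature names and counts are hypothetical, and `rank_features` is an illustrative helper, not part of XGBoost.

```python
# Hypothetical counts, shaped like the dict returned by Booster.get_fscore()
example_fscore = {"f0": 12, "f1": 3, "f2": 27}


def rank_features(fscore_by_feature):
    """Return (feature, count) pairs ordered from most to least used."""
    return sorted(fscore_by_feature.items(), key=lambda item: item[1], reverse=True)


ranked = rank_features(example_fscore)
```

Sorting with an explicit key keeps the feature name first in each pair and avoids comparing names when two counts tie.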
@walterreade
walterreade / gist:2aa0879f140c94b00653
Last active October 20, 2015 01:13
Standard ML Imports
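import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# 'display.mpl_style' was removed from pandas; 'ggplot' is an assumed close substitute
plt.style.use('ggplot')
pd.set_option('display.width', 200)
pd.set_option('display.max_columns', 20)
pd.set_option('display.max_rows', 50)
# the bare 'precision' key is ambiguous in recent pandas; use the full option name
pd.set_option('display.precision', 5)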
@walterreade
walterreade / DefaultFormula.py
Created May 23, 2013 20:12
Default Radio Button
thisField.setValue(0)