@stasSajin
Last active November 1, 2018 03:43
import pandas as pd
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split
data = pd.read_feather('data.feather')
data = pd.get_dummies(data, drop_first=True)
# Hold out 30% of the rows; the target is modelled on the log1p scale.
X_train, X_test, y_train, y_test = train_test_split(
    data.drop('salary', axis=1),
    np.log1p(data.salary.values),
    test_size=0.30,
    random_state=12345,
)
dtrain = xgb.DMatrix(data = X_train, label=y_train)
dtest = xgb.DMatrix(data = X_test, label=y_test)
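# 'reg:squarederror' is the current name of the squared-error objective;
# XGBoost releases before 0.90 call it 'reg:linear'.
param = {'eta': 0.1,
         'objective': 'reg:squarederror'}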
xgb_model = xgb.train(param, dtrain, num_boost_round=100)
# Invert the log1p transform; np.expm1(x) computes exp(x) - 1.
X_test['y_hat_with_bias'] = np.expm1(xgb_model.predict(dtest))
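
The column name `y_hat_with_bias` suggests that the back-transformed predictions are expected to be biased. The sketch below illustrates why that would be the case, and it is not part of the original gist. It uses synthetic log-normal data (an assumption, standing in for `data.feather`) and a constant log-scale prediction in place of the XGBoost model. It then applies Duan's smearing estimate, which is one common correction and not necessarily the one the author had in mind. All variable names are hypothetical.

```python
import numpy as np

# Hypothetical synthetic "salaries"; NOT the gist's data.feather.
rng = np.random.default_rng(12345)
salary = rng.lognormal(mean=11.0, sigma=0.5, size=100_000)

# Stand-in for a model fitted on the log1p scale: the best constant
# predictor under squared error is the mean of log1p(salary).
log_target = np.log1p(salary)
log_pred = log_target.mean()

# Naive back-transform, as in the script above.
naive = np.expm1(log_pred)

# By Jensen's inequality, exp(mean(log)) <= mean(exp(log)), so the naive
# back-transform underestimates the arithmetic mean of salary.
true_mean = salary.mean()

# Duan's smearing estimate: rescale by the mean of exp(residuals),
# where residuals are measured on the log scale.
residuals = log_target - log_pred
smearing_factor = np.exp(residuals).mean()
corrected = (naive + 1.0) * smearing_factor - 1.0

print(f"naive back-transform: {naive:,.0f}")
print(f"smearing-corrected:   {corrected:,.0f}")
print(f"true mean:            {true_mean:,.0f}")
```

With a real model the residuals would come from the training set (`y_train` minus the model's training predictions), and the same factor would be applied to `np.exp` of the test predictions before subtracting 1.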