Last active May 26, 2024 20:25
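import xgboost as xgb
from sklearn import metrics
from sklearn.model_selection import GridSearchCV

# Assumes train, test, y_train, and y_test exist in the enclosing scope;
# y_test is inferred from the function's signature.
def auc(m, train, test):
    """Return (train AUC, test AUC) computed from positive-class probabilities."""
    return (metrics.roc_auc_score(y_train, m.predict_proba(train)[:, 1]),
            metrics.roc_auc_score(y_test, m.predict_proba(test)[:, 1]))

# Parameter Tuning
model = xgb.XGBClassifier()
param_dist = {"max_depth": [10, 30, 50],
              "min_child_weight": [1, 3, 6],
              "n_estimators": [200],
              "learning_rate": [0.05, 0.1, 0.16]}
grid_search = GridSearchCV(model, param_grid=param_dist, cv=3,
                           verbose=10, n_jobs=-1)
grid_search.fit(train, y_train)

# Train a model with fixed hyperparameters
model = xgb.XGBClassifier(max_depth=50, min_child_weight=1, n_estimators=200,
                          n_jobs=-1, verbosity=1, learning_rate=0.16)
model.fit(train, y_train)

auc(model, train, test)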

Hi Alvira,

I read your awesome post about XGBoost/LightGBM/CatBoost on Medium and came here hoping to ask you a couple of questions:

  1. I copied, pasted, and ran your code (the LightGBM part), and it turned out that model2.predict(train) returns real numbers instead of binary labels. Aren't we supposed to give binary predictions (delayed or not)?
  2. If real-valued predictions are intended, is it because you need real numbers to compute AUC? (I'm going to read more about AUC tonight.) And why don't we use binary predictions to compute AUC?

Thank you in advance!
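
A small sketch may help with the second question. AUC measures how well the scores rank positives above negatives, so it needs real-valued scores. Thresholding them into 0/1 labels first collapses the ranking into two groups. The real-valued output described above is consistent with LightGBM's native `Booster.predict`, which returns probabilities for a binary objective. The `roc_auc` helper and the toy data below are hypothetical and written only for illustration; they are not taken from the gist or the Medium post.

```python
def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Hypothetical helper for illustration only.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and predicted probabilities (made up for this example).
y_true = [0, 0, 1, 1, 0, 1]
proba = [0.10, 0.40, 0.55, 0.80, 0.30, 0.45]

# Binary predictions obtained by thresholding at 0.5.
labels = [1 if p >= 0.5 else 0 for p in proba]

auc_from_proba = roc_auc(y_true, proba)    # every positive outranks every negative
auc_from_labels = roc_auc(y_true, labels)  # the positive scored 0.45 now ties with the negatives

print(auc_from_proba, auc_from_labels)
```

In this toy case the probabilities rank every positive above every negative, but the thresholded labels score lower because the positive at 0.45 becomes indistinguishable from the negatives. Binary labels are still what you want for a final delayed/not-delayed decision; they are just a lossy input for AUC.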



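Great example of a grid search with a custom metric.

Generally, you can use the grid_search.best_estimator_ attribute to access a fitted model directly. With the default refit=True, GridSearchCV already refits the best parameter combination on the full training set, so there is no need to retrain a model by hand.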
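
As a rough sketch of that suggestion, the snippet below runs a small grid search and then scores with `best_estimator_` directly. It assumes scikit-learn is installed. The synthetic data, the `GradientBoostingClassifier` stand-in (used instead of `xgb.XGBClassifier` to keep the example dependency-light), and the tiny parameter grid are all assumptions made for illustration, not the setup from the gist.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (hypothetical; not the dataset used in the gist).
X, y = make_classification(n_samples=300, n_features=10,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Deliberately small grid so the example runs quickly.
param_grid = {"max_depth": [2, 3], "learning_rate": [0.05, 0.1]}
grid_search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    param_grid=param_grid,
    scoring="roc_auc",  # select hyperparameters by cross-validated AUC
    cv=3,
)
grid_search.fit(X_train, y_train)

# refit=True (the default) means best_estimator_ is already trained
# on all of X_train with the winning parameters.
best_model = grid_search.best_estimator_
test_auc = roc_auc_score(y_test, best_model.predict_proba(X_test)[:, 1])

print(grid_search.best_params_, round(test_auc, 3))
```

Passing `scoring="roc_auc"` is one way to make the search optimize the same metric that is reported afterwards; without it, a classifier's grid search falls back to the estimator's default score, which is accuracy.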
