@sandys
Created August 18, 2019 16:26
Gridsearchcv with k-fold cross validation and early stopping
import xgboost.sklearn as xgb
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import TimeSeriesSplit  # alternative splitter for time-ordered data
from sklearn.metrics import r2_score

cv = 2

trainX = [[1], [2], [3], [4], [5]]
trainY = [1, 2, 1, 2, 1]

# these are the evaluation sets (reused from the training data for brevity)
testX = trainX
testY = trainY

paramGrid = {"subsample": [0.5, 0.8]}

# forwarded to XGBRegressor.fit for every grid point; early stopping
# monitors MAE on eval_set (note: recent xgboost releases expect
# early_stopping_rounds on the estimator constructor rather than fit)
fit_params = {"early_stopping_rounds": 42,
              "eval_metric": "mae",
              "eval_set": [(testX, testY)]}

model = xgb.XGBRegressor()

# StratifiedKFold assumes class-like targets; it works here only because
# trainY takes two discrete values
skf = StratifiedKFold(n_splits=cv, shuffle=True, random_state=999)
gridsearch = GridSearchCV(model, paramGrid, verbose=1,
                          cv=skf.split(trainX, trainY))

g = gridsearch.fit(trainX, trainY, **fit_params)  # fit returns the fitted GridSearchCV

y_true, y_pred = testY, gridsearch.predict(testX)
print(r2_score(y_true, y_pred))
print(g.cv_results_)
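The gist imports TimeSeriesSplit but never uses it. For time-ordered data it is usually the safer splitter, because StratifiedKFold with shuffle=True mixes future rows into the training folds. A minimal sketch of how it partitions the same toy feature matrix (reusing the gist's variable name; the choice to swap it in is an assumption about intent, not part of the original code):

```python
from sklearn.model_selection import TimeSeriesSplit

# Same toy feature matrix as in the gist.
trainX = [[1], [2], [3], [4], [5]]

# With 5 rows and n_splits=2, each one-row test fold comes strictly
# after its training fold, so no future observation leaks into training.
tscv = TimeSeriesSplit(n_splits=2)
splits = [(list(train), list(test)) for train, test in tscv.split(trainX)]
print(splits)
```

To use it in the grid search, pass `cv=tscv.split(trainX, trainY)` (or simply `cv=tscv`) to GridSearchCV in place of the StratifiedKFold splits.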