Parameter Space Exploration for Gradient Boosting Regressor on Air Quality Dataset
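#Imports needed by this snippet
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
#aq_final is assumed to be the preprocessed Air Quality DataFrame built in an earlier step
#Compound list (sensor response columns that can serve as the target)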
y_predict = ['PT08.S1(CO)','PT08.S2(NMHC)','PT08.S3(NOx)','PT08.S4(NO2)','PT08.S5(O3)']
#Select compound to optimize model for
compound = y_predict[0]
#Designate independent and dependent variables
x = aq_final.drop([compound], axis = 1)
y = aq_final[compound]
#Split data into test and training sets
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size = 0.2, random_state = 42)
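#Initiate Gradient Boosting Regressor for ML model
#max_features = None uses all features; the older 'auto' option meant the same here but is deprecated and rejected by recent scikit-learn releases
gb = GradientBoostingRegressor(loss = 'absolute_error', random_state=42, max_features = None)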
#Set up values for parameters
grid_values = {
    'learning_rate':[1, 0.9, 0.8, 0.6, 0.5, 0.3, 0.1, 0.07, 0.05, 0.03, 0.01],
    'n_estimators':[10, 30, 50, 70, 90, 110, 130, 150, 170, 190, 200],
    'max_depth': [1,2,3,4,5,6,7,8,9,10,11],
}
#Set up the grid search (11 x 11 x 11 = 1331 parameter combinations, each cross-validated on 5 folds)
grid_clf_acc = GridSearchCV(gb,
                            param_grid = grid_values,
                            cv=5,
                            scoring = 'r2',
                            n_jobs=4,
                            return_train_score=True,
                            verbose=4)
#Run the grid search on the training set; by default the best model is then refit on all of the training data
grid_clf_acc.fit(X_train, y_train)