Lianne & Justin @ Just into Data liannewriting

@liannewriting
liannewriting / plot-feature-importance.py
Created November 30, 2022 15:26
xgboost python machine learning
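from xgboost import plot_importance
# steps[1] is the (name, estimator) tuple for the 'clf' step of the tuned pipeline
xgboost_step = opt.best_estimator_.steps[1]
# The second element of that tuple is the fitted XGBClassifier
xgboost_model = xgboost_step[1]
plot_importance(xgboost_model)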
liannewriting / print-best-estimator.py
Created November 30, 2022 15:25
xgboost python machine learning
opt.best_estimator_.steps
liannewriting / predict-probability.py
Last active December 5, 2022 16:19
xgboost python machine learning
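# Predicted class labels from the best pipeline found by the search
opt.predict(X_test)
# Predicted class probabilities, one column per class
opt.predict_proba(X_test)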
liannewriting / evaluate-score.py
Last active November 30, 2022 19:41
xgboost python machine learning
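# Mean cross-validated score of the best parameter setting
opt.best_score_
# Score of the best pipeline on the held-out test set
opt.score(X_test, y_test)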
liannewriting / print-best-estimator.py
Created November 30, 2022 15:23
xgboost python machine learning
opt.best_estimator_
liannewriting / fit_xgboost.py
Created November 30, 2022 15:22
xgboost python machine learning
opt.fit(X_train, y_train)
liannewriting / set-up-hyperparameter-tuning.py
Last active December 6, 2022 16:30
xgboost python machine learning
from skopt import BayesSearchCV
from skopt.space import Real, Categorical, Integer
search_space = {
    'clf__max_depth': Integer(2, 8),
    'clf__learning_rate': Real(0.001, 1.0, prior='log-uniform'),
    'clf__subsample': Real(0.5, 1.0),
    'clf__colsample_bytree': Real(0.5, 1.0),
    'clf__colsample_bylevel': Real(0.5, 1.0),
    'clf__colsample_bynode': Real(0.5, 1.0),
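The `prior='log-uniform'` setting on `clf__learning_rate` means candidate values are spread evenly across orders of magnitude rather than evenly across the raw range. The following is a minimal standard-library sketch of that idea, not skopt's own implementation; the helper `sample_log_uniform`, the seed, and the sample count are hypothetical names and values chosen for illustration.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw one value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(8)  # arbitrary seed, only for repeatability
samples = [sample_log_uniform(0.001, 1.0, rng) for _ in range(10_000)]

# Count draws per decade: [0.001, 0.01), [0.01, 0.1), [0.1, 1.0].
# With a log-uniform prior, each decade should get roughly a third of the draws.
decade_counts = [0, 0, 0]
for value in samples:
    index = min(int(math.log10(value) + 3), 2)
    decade_counts[index] += 1

print(decade_counts)
```

A plain uniform draw over the same range would put about 90% of the values above 0.1, which is why a log scale is a common choice for learning rates.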
liannewriting / set-up-pipeline.py
Last active December 7, 2022 15:54
xgboost python machine learning
from sklearn.pipeline import Pipeline
from category_encoders.target_encoder import TargetEncoder
from xgboost import XGBClassifier
estimators = [
    ('encoder', TargetEncoder()),
    ('clf', XGBClassifier(random_state=8))  # can customize objective function with the objective parameter
]
pipe = Pipeline(estimators)
pipe
liannewriting / split_train_test.py
Last active December 1, 2022 00:01
xgboost python machine learning
from sklearn.model_selection import train_test_split
X = df.drop(columns='result')
y = df['result']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y, random_state=8)
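Passing `stratify=y` asks the split to keep the class proportions of `y` roughly the same in the train and test sets. Here is a minimal standard-library sketch of that behavior, assuming a simple per-class shuffle; it is not scikit-learn's implementation, and `stratified_split` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size, seed):
    """Return (train_idx, test_idx), splitting each class at roughly test_size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within the class only
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical imbalanced labels: 80 samples of class 0, 20 of class 1.
labels = [0] * 80 + [1] * 20
train_idx, test_idx = stratified_split(labels, test_size=0.2, seed=8)
test_labels = [labels[i] for i in test_idx]
print(len(train_idx), len(test_idx), test_labels.count(1))
```

Both subsets keep the 80/20 class ratio of the toy labels, which matters for an imbalanced target such as `result`.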
liannewriting / explore_data.py
Created November 30, 2022 15:05
xgboost python machine learning
df.info()
df['result'].value_counts()