Simple decision tree classifier with Hyperparameter tuning using RandomizedSearch
# Import necessary modules
from scipy.stats import randint
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import RandomizedSearchCV
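# Set up the parameters and distributions to sample from: param_dist
# Note: randint(1, 9) draws integers from 1 to 8 (the upper bound is exclusive),
# and max_features must not exceed the number of columns in X
param_dist = {"max_depth": [3, None],
              "max_features": randint(1, 9),
              "min_samples_leaf": randint(1, 9),
              "criterion": ["gini", "entropy"]}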
# Instantiate a Decision Tree classifier: tree
tree = DecisionTreeClassifier()
# Instantiate the RandomizedSearchCV object: tree_cv
tree_cv = RandomizedSearchCV(tree, param_dist, cv=5)
# Fit it to the data (X: feature matrix, y: labels, both assumed to be defined already)
tree_cv.fit(X, y)
# Print the tuned parameters and score
print("Tuned Decision Tree Parameters: {}".format(tree_cv.best_params_))
print("Best score is {}".format(tree_cv.best_score_))
@ImanJowkar

What exactly does scipy.stats.randint do?
Does it generate one random number in [1, 9] in this case?

@otaviomguerra
Author

otaviomguerra commented May 29, 2021

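randint(1, 9) is a parameter distribution, not a single number. RandomizedSearchCV samples from it each time it builds a candidate parameter setting, so every candidate gets its own random integer. Note that scipy.stats.randint excludes the upper bound, so the values drawn here range from 1 to 8, not 1 to 9. For more info, see: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html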
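
To see the sampling behaviour in isolation, here is a minimal sketch that uses only scipy. The names `dist` and `samples`, the sample size, and the seed are illustrative choices and not part of the gist. It draws repeatedly from the same distribution object, which is roughly what the search does once per candidate.

```python
from scipy.stats import randint

# randint(low, high) is a "frozen" discrete uniform distribution over
# the integers low, ..., high - 1; the upper bound itself is never drawn.
dist = randint(1, 9)

# Draw many samples from the same distribution object
# (size and seed are arbitrary, chosen only for this illustration).
samples = dist.rvs(size=1000, random_state=0)

# Every sample lies between 1 and 8 inclusive.
print("smallest sample:", samples.min())
print("largest sample:", samples.max())

# The probability mass function confirms it: 8 is possible, 9 is not.
print("P(value == 8):", dist.pmf(8))
print("P(value == 9):", dist.pmf(9))
```

If the intent is to allow 9 as well, `randint(1, 10)` would be the distribution to use.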
