Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
# stratified sampling
from sklearn.model_selection import StratifiedShuffleSplit
split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
for train_index, test_index in split.split(housing, housing["income_cat"]):
strat_train_set = housing.loc[train_index]
strat_test_set = housing.loc[test_index]
for set_ in (strat_train_set, strat_test_set):
set_.drop('income_cat', axis=1, inplace=True)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment