@amankharwal
Created December 22, 2020 08:14
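# Stratified sampling: split housing 80/20 while preserving the income_cat proportions
from sklearn.model_selection import StratifiedShuffleSplit

split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
for train_index, test_index in split.split(housing, housing["income_cat"]):
    # split() yields positional indices, so select rows with iloc rather than loc
    strat_train_set = housing.iloc[train_index].copy()
    strat_test_set = housing.iloc[test_index].copy()

# income_cat was only needed for stratification, so drop it from both sets
for set_ in (strat_train_set, strat_test_set):
    set_.drop("income_cat", axis=1, inplace=True)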
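
The snippet above assumes a `housing` DataFrame that already has an `income_cat` column, which the gist does not show. The following is a minimal, self-contained sketch of the same technique on synthetic data. The `median_income` column, the random values, and the bin edges are all assumptions made for illustration, not the gist's actual data or binning. It checks that the test set keeps roughly the same category proportions as the full data set.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical stand-in for the housing data: a single synthetic income column.
rng = np.random.default_rng(42)
housing = pd.DataFrame(
    {"median_income": rng.lognormal(mean=1.2, sigma=0.5, size=1000)}
)

# Assumed binning (not shown in the gist): five income bands, coded 0..4.
housing["income_cat"] = pd.cut(
    housing["median_income"],
    bins=[0.0, 1.5, 3.0, 4.5, 6.0, np.inf],
    labels=False,
)

# Same splitter configuration as in the snippet above.
split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_index, test_index = next(split.split(housing, housing["income_cat"]))
strat_train_set = housing.iloc[train_index]
strat_test_set = housing.iloc[test_index]

# Compare category proportions in the full data and in the test set.
overall = housing["income_cat"].value_counts(normalize=True).sort_index()
in_test = strat_test_set["income_cat"].value_counts(normalize=True).sort_index()
max_gap = (overall - in_test).abs().max()
print(f"largest proportion gap: {max_gap:.4f}")
```

With a plain random split the gap between the two sets of proportions can be noticeably larger, which is the reason for stratifying on the income category in the first place.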