@StephenFordham
Last active October 12, 2020 19:27
Testing strategies
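import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Assumes `model` (a scikit-learn classifier), `X` (features containing
# missing values) and `y` (class labels) are already defined.
results = []
strategies = ['mean', 'median', 'most_frequent', 'constant']
for s in strategies:
    # Impute inside the pipeline so the fill statistic is learned from each training fold only
    pipeline = Pipeline([('impute', SimpleImputer(strategy=s)), ('model', model)])
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    scores = cross_val_score(pipeline, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    results.append(scores)

# Report the mean and best fold accuracy for each imputation strategy
for method, accuracy in zip(strategies, results):
    print('Method: {0}, mean accuracy: = {1:.3f}, max accuracy: {2:.3f}'.format(method, np.mean(accuracy), np.max(accuracy)))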
# Output:
# Method: mean, mean accuracy: = 0.849, max accuracy: 0.858
# Method: median, mean accuracy: = 0.848, max accuracy: 0.858
# Method: most_frequent, mean accuracy: = 0.848, max accuracy: 0.861
# Method: constant, mean accuracy: = 0.849, max accuracy: 0.868
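
The four strategy names compared above each fill missing entries in a column with a different value. The sketch below illustrates this with the standard library only; it is not scikit-learn's implementation. The helpers `fill_value_for` and `impute` and the sample `column` are hypothetical names invented for this illustration. The tie-breaking rule for `most_frequent` (smallest value wins) and the default constant of 0 follow the documented behaviour of `SimpleImputer` for numeric data.

```python
import math
import statistics


def fill_value_for(column, strategy, constant=0.0):
    """Return the value used to replace NaN entries in `column` (hypothetical helper)."""
    observed = [v for v in column if not math.isnan(v)]
    if strategy == 'mean':
        return statistics.fmean(observed)
    if strategy == 'median':
        return statistics.median(observed)
    if strategy == 'most_frequent':
        # On ties, pick the smallest of the most common values
        return min(statistics.multimode(observed))
    if strategy == 'constant':
        return constant
    raise ValueError('unknown strategy: {0}'.format(strategy))


def impute(column, strategy, constant=0.0):
    """Return a copy of `column` with NaN entries replaced (hypothetical helper)."""
    fill = fill_value_for(column, strategy, constant)
    return [fill if math.isnan(v) else v for v in column]


# Illustrative data, not taken from the dataset used in the gist
column = [1.0, 2.0, 2.0, float('nan'), 7.0]
for s in ['mean', 'median', 'most_frequent', 'constant']:
    print(s, impute(column, s))
```

In this toy column, three of the four strategies fill with values close to each other, which may be one reason the accuracies reported above differ so little between strategies.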