@mGalarnyk
Last active July 3, 2022 03:10
Choosing the right solver for a problem (logistic regression) can save a lot of time. Code from: https://medium.com/distributed-computing-with-ray/how-to-speed-up-scikit-learn-model-training-aaf17e2d1e1
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Generate a large synthetic binary classification dataset
X, y = make_classification(n_samples=1000000, n_features=1000, n_classes=2)
# Hold out 10,000 samples for validation; the rest is used for training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=10000)

# Solvers to compare
solvers = ['liblinear', 'saga']

for sol in solvers:
    # Time only the fit call for each solver
    start = time.time()
    logreg = LogisticRegression(solver=sol)
    logreg.fit(X_train, y_train)
    end = time.time()
    print(sol + " Fit Time: ", end - start)
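The script above builds a 1,000,000 x 1,000 dataset, which needs several gigabytes of memory and can take a long time to fit. For a quick check of the same comparison, the sketch below wraps the timing loop in a helper and runs it on a much smaller dataset. The helper name `time_solvers`, the reduced dataset size, the `max_iter=200` setting, and the fixed `random_state` are assumptions chosen for illustration and are not part of the original gist. Timings on data this small will not necessarily rank the solvers the same way as the full-size run.

```python
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def time_solvers(X_train, y_train, solvers, max_iter=200):
    """Fit one LogisticRegression per solver and return {solver: seconds}."""
    fit_times = {}
    for sol in solvers:
        logreg = LogisticRegression(solver=sol, max_iter=max_iter)
        # perf_counter is a monotonic clock intended for measuring intervals
        start = time.perf_counter()
        logreg.fit(X_train, y_train)
        fit_times[sol] = time.perf_counter() - start
    return fit_times


# Deliberately small data (an assumption for illustration) so this runs in seconds
X, y = make_classification(n_samples=5000, n_features=50, n_classes=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=500, random_state=0)

fit_times = time_solvers(X_train, y_train, ["liblinear", "saga"])
for sol, seconds in fit_times.items():
    print(f"{sol} Fit Time: {seconds:.3f} s")
```

Once the helper behaves as expected, scaling `n_samples` and `n_features` back up toward the original values should show how the gap between the two solvers changes with dataset size.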