@vishalmehta1991
Last active September 21, 2021 21:30
Multi GPU RF using DASK
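from dask.distributed import Client
from dask_cuda import LocalCUDACluster
from sklearn.metrics import accuracy_score  # assumed source of accuracy_score
from cuml.dask.common import utils as dask_utils
from cuml.dask.ensemble import RandomForestClassifier as cuRFC_mg

# cuML random forest parameters
cu_rf_params = {
    'n_estimators': 25,
    'max_depth': 13,
    'n_bins': 15,
    'n_streams': 8,
}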
# Start by setting up the CUDA cluster on the local host
# (n_workers is assumed to be set beforehand, e.g. to the number of available GPUs)
cluster = LocalCUDACluster(threads_per_worker=1, n_workers=n_workers)
c = Client(cluster)
workers = c.has_what().keys()
# Shard the data across all workers
X_train_df, y_train_df = dask_utils.persist_across_workers(
    c, [X_train_df, y_train_df], workers=workers
)
# Build and train the model
cu_rf_mg = cuRFC_mg(**cu_rf_params)
cu_rf_mg.fit(X_train_df, y_train_df)
# Check the accuracy on a test set
cu_rf_mg_predict = cu_rf_mg.predict(X_test)
acc_score = accuracy_score(y_test, cu_rf_mg_predict, normalize=True)
c.close()
cluster.close()