@rikturr
Created October 13, 2020 19:55
dask-rapids
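# Note "dask" in these imports: these are the distributed (Dask) variants
import dask_cudf
from cuml.dask.ensemble import RandomForestClassifier

# Read the CSV from the public S3 bucket into a distributed GPU dataframe
taxi = dask_cudf.read_csv(
    's3://nyc-tlc/trip data/yellow_tripdata_2019-01.csv',
    parse_dates=['tpep_pickup_datetime', 'tpep_dropoff_datetime'],
    storage_options={'anon': True},  # anonymous access, no AWS credentials
    assume_missing=True,  # treat integer columns without an explicit dtype as floats
)

# prep_df, features, and y_col are not defined in this snippet;
# they are assumed to be defined elsewhere.
taxi_train = prep_df(taxi)

# Assumes a Dask client is already connected to a GPU cluster.
# Later cuML releases name the `seed` parameter `random_state`.
rfc = RandomForestClassifier(n_estimators=100, max_depth=10, seed=42)
rfc.fit(taxi_train[features], taxi_train[y_col])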