@rikturr
Created July 30, 2020 20:13
dask_features
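from dask import persist
from dask.distributed import wait

# `taxi` (assumed to be a Dask DataFrame) and `features` (a list of column
# names) are defined elsewhere and are not part of this snippet.

# Derive time-based features from the pickup timestamp (weekday: Monday = 0).
taxi['pickup_weekday'] = taxi.tpep_pickup_datetime.dt.weekday
taxi['pickup_hour'] = taxi.tpep_pickup_datetime.dt.hour
taxi['pickup_minute'] = taxi.tpep_pickup_datetime.dt.minute
# Hour of the week: 0 (Monday 00:00) through 167 (Sunday 23:00).
taxi['pickup_week_hour'] = (taxi.pickup_weekday * 24) + taxi.pickup_hour
# Encode the flag as 1.0 for 'Y' and 0.0 otherwise.
taxi['store_and_fwd_flag'] = (taxi.store_and_fwd_flag == 'Y').astype(float)
# Replace remaining missing values with a -1 sentinel.
taxi = taxi.fillna(-1)

X = taxi[features].astype('float32')
y = taxi['total_amount']

# Start computing X and y on the cluster, then block until both are in memory.
X, y = persist(X, y)
_ = wait([X, y])
len(X)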
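
The `pickup_week_hour` feature above combines weekday and hour into one hour-of-week index. Here is a minimal standard-library sketch of that arithmetic, useful for checking the range of values to expect. The `week_hour` helper and the sample timestamps are hypothetical and are not taken from the taxi data. It relies on `datetime.weekday()` using the same Monday = 0 convention as the `.dt.weekday` accessor.

```python
from datetime import datetime


def week_hour(ts: datetime) -> int:
    """Return the hour-of-week index: weekday * 24 + hour (Monday = 0)."""
    return ts.weekday() * 24 + ts.hour


# Hypothetical timestamps chosen to cover the start, middle, and end of a week.
samples = [
    datetime(2020, 7, 27, 0, 15),   # a Monday, just after midnight
    datetime(2020, 7, 30, 20, 13),  # a Thursday evening
    datetime(2020, 8, 2, 23, 59),   # a Sunday, last hour of the week
]

for ts in samples:
    print(ts.isoformat(), week_hour(ts))
```

The index runs from 0 to 167, so a model can treat it as a single feature capturing weekly seasonality instead of learning the weekday and hour interaction on its own.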