@theredpea
Created October 31, 2019 00:34
import pandas as pd  # assumes pyber_data_df has already been loaded as a DataFrame

# Create a new field 'datetime' by converting the 'date' strings to datetime objects.
# Datetime values let the plot's x-axis show appropriate labels like "Feb 2018".
pyber_data_df['datetime'] = pd.to_datetime(pyber_data_df['date'])
# A datetime column *also* lets us keep the date part (2019-01-04) and drop the time-of-day part (05:39:20).
# Create another new field 'datedate' to store it.
pyber_data_df['datedate'] = pyber_data_df['datetime'].dt.floor('D')
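To see what the two new columns hold, here is a minimal sketch on a few made-up rows. The name `sample_df` and its values are hypothetical, since the real `pyber_data_df` is not shown in this gist.

```python
import pandas as pd

# Hypothetical sample rows standing in for pyber_data_df (assumption: 'date' holds strings).
sample_df = pd.DataFrame({
    'date': ['2019-01-04 05:39:20', '2019-01-04 18:02:11', '2019-02-10 23:59:59'],
})

# Parse the strings, then round each timestamp down to midnight of the same day.
sample_df['datetime'] = pd.to_datetime(sample_df['date'])
sample_df['datedate'] = sample_df['datetime'].dt.floor('D')

print(sample_df)
```

Both rides on 2019-01-04 end up with the same `datedate`, which is what lets them fall into the same group later.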
avg_fare_by_type_date = (pyber_data_df
    # Group by multiple fields by passing a list of those fields;
    # each field becomes a "level" in the index key that identifies each group.
    .groupby(['type', 'datedate'])['fare']
    # Get the average fare; selecting 'fare' first keeps non-numeric columns out of mean().
    .mean())
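The shape of the grouped result may be easier to see on toy data. This sketch assumes made-up fares and the hypothetical names `toy_df` and `toy_avg`; it shows that grouping by two fields yields a Series whose index has one level per field.

```python
import pandas as pd

# Hypothetical rides: two Urban rides on the same day, one Rural, one Urban the next day.
toy_df = pd.DataFrame({
    'type': ['Urban', 'Urban', 'Rural', 'Urban'],
    'datedate': pd.to_datetime(['2019-01-04', '2019-01-04', '2019-01-04', '2019-01-05']),
    'fare': [10.0, 20.0, 40.0, 30.0],
})

toy_avg = toy_df.groupby(['type', 'datedate'])['fare'].mean()

# One index level per grouped field, in the order they were passed to groupby.
print(list(toy_avg.index.names))

# Look up a single group by its (type, date) key.
print(toy_avg.loc[('Urban', pd.Timestamp('2019-01-04'))])
```

Level 0 of this index is `type`, which is the level the next step pivots into columns.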
(avg_fare_by_type_date
    # Pivot the city types from rows to columns; pandas plots one line per column.
    # The first level (level=0) of the grouped result's index is the `type` of city;
    # `unstack` pivots that level into new columns and leaves the dates as the row index.
    .unstack(level=0)
    # Some dates are missing values for a city type; use pandas `ffill` or `interpolate` to decide how to fill them.
    .ffill()
    # Daily averages are noisy, so take a "rolling average" of each column with a "window size" of 10 rows
    # (10 days only if no dates are missing from the index; `.rolling('10D')` would use calendar days instead).
    .rolling(10).mean()
    .plot())
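The unstack, fill, and smoothing steps above can be traced on a small made-up Series. The names `toy_avg`, `wide`, `filled`, and `smoothed` are hypothetical, and a window of 2 is used instead of 10 only so the effect is visible on three rows.

```python
import pandas as pd

dates = pd.to_datetime(['2019-01-01', '2019-01-02', '2019-01-03'])

# Hypothetical average fares; Urban has no value for 2019-01-02.
toy_avg = pd.Series(
    [10.0, 20.0, 30.0, 40.0, 60.0],
    index=pd.MultiIndex.from_tuples(
        [('Rural', dates[0]), ('Rural', dates[1]), ('Rural', dates[2]),
         ('Urban', dates[0]), ('Urban', dates[2])],
        names=['type', 'datedate'],
    ),
    name='fare',
)

wide = toy_avg.unstack(level=0)      # one column per city type; the Urban gap becomes NaN
filled = wide.ffill()                # carry the last known value forward into the gap
smoothed = filled.rolling(2).mean()  # average each row with the row before it

print(wide)
print(filled)
print(smoothed)
```

The first row of `smoothed` is NaN because a full window is not yet available; with `rolling(10)` the same applies to the first nine rows. Calling `.plot()` on `smoothed` would draw one line per city type.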