@dottyz
Created May 2, 2019 18:36
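import pandas as pd
from pandas.api.types import CategoricalDtype

# Assumption: month_type and day_type are used below but not defined in this snippet;
# they are sketched here as ordered categorical dtypes so the code runs on its own
month_type = CategoricalDtype(['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'], ordered=True)
day_type = CategoricalDtype(['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'], ordered=True)

# Clean up column names for ease of use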
df.columns = [' '.join(x.replace('trip_', '').replace('_seconds', '').split('_')).title() for x in df.columns]
df['Start Time'] = pd.to_datetime(df['Start Time'])
# Derive calendar fields with the vectorized .dt accessor instead of row-wise apply
df['Date'] = df['Start Time'].dt.strftime('%Y-%m-%d')
df['Quarter'] = df['Start Time'].dt.quarter
df['Month'] = df['Start Time'].dt.strftime('%B').astype(month_type)
df['Day of Week'] = df['Start Time'].dt.strftime('%a').astype(day_type)
df['Hour'] = df['Start Time'].dt.strftime('%H')  # zero-padded string, e.g. '08'
# Create a "Route ID" in the "[start station ID]-[end station ID]" format
df['Route Id'] = df['Station Id From'].astype(int).astype(str) + '-' + df['Station Id To'].astype(int).astype(str)
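
The snippet above depends on a `df` loaded elsewhere, so here is a minimal, self-contained sketch of the same steps on two made-up rows. The raw column names (`trip_start_time`, `station_id_from`, and so on), the sample values, and the definitions of `month_type` and `day_type` are all assumptions for illustration, not taken from the original data.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample rows; raw column names are assumed, not from the real dataset
df = pd.DataFrame({
    'trip_id': [1, 2],
    'trip_start_time': ['2017-01-01 08:15:00', '2017-11-07 17:40:00'],
    'trip_duration_seconds': [300, 900],
    'station_id_from': [7001.0, 7010.0],
    'station_id_to': [7002.0, 7001.0],
})

# Assumed definitions: ordered categoricals so months and weekdays sort in calendar order
month_type = CategoricalDtype(
    ['January', 'February', 'March', 'April', 'May', 'June', 'July',
     'August', 'September', 'October', 'November', 'December'], ordered=True)
day_type = CategoricalDtype(['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'], ordered=True)

# 'trip_start_time' -> 'Start Time', 'trip_duration_seconds' -> 'Duration', etc.
df.columns = [' '.join(x.replace('trip_', '').replace('_seconds', '').split('_')).title()
              for x in df.columns]

# Parse timestamps, then derive the calendar fields
df['Start Time'] = pd.to_datetime(df['Start Time'])
df['Date'] = df['Start Time'].dt.strftime('%Y-%m-%d')
df['Quarter'] = df['Start Time'].dt.quarter
df['Month'] = df['Start Time'].dt.strftime('%B').astype(month_type)
df['Day of Week'] = df['Start Time'].dt.strftime('%a').astype(day_type)
df['Hour'] = df['Start Time'].dt.strftime('%H')  # kept as a zero-padded string

# Route ID in the "[start station ID]-[end station ID]" format
df['Route Id'] = (df['Station Id From'].astype(int).astype(str) + '-'
                  + df['Station Id To'].astype(int).astype(str))

print(df[['Date', 'Quarter', 'Month', 'Day of Week', 'Hour', 'Route Id']])
```

Because `Month` and `Day of Week` are ordered categoricals in this sketch, sorting or grouping by them should follow calendar order rather than alphabetical order. Note that `%B` and `%a` depend on the active locale, so the category labels assume English month and weekday names.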