@dottyz
Last active May 3, 2019 14:22
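import re

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Import the weather data; header=22 skips the first 22 rows (descriptions of the
# weather station) and uses the next row as the column names
weather = pd.read_csv('./data/weather.csv', header=22)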
# Remove units contained in the column names (e.g. Celsius, mm, etc.) and rename 'Date/Time' to 'Date'
weather.columns = [re.sub(r'\([^()]*\)', '', x).strip() if x != 'Date/Time' else 'Date' for x in weather.columns]
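# Count unique trips per day and user type, then pivot to one column per user type
# (df is the trips DataFrame, assumed to be loaded earlier; it is not defined in this snippet)
data = df.groupby(['Date', 'User Type'])['Id'].nunique().to_frame().pivot_table(index='Date', columns='User Type').reset_index()
# Pivoted columns come out in sorted order, so this assumes the casual type sorts before the member type
data.columns = ['Date', 'Casual Trips', 'Member Trips']
# Attach daily mean temperature and total precipitation; the inner join keeps only dates present in both tables
data = data.merge(weather[['Date', 'Mean Temp', 'Total Precip']], on='Date', how='inner')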
# Pairwise scatter plots of the numeric columns, with kernel density estimates on the diagonal
g = sns.pairplot(data, diag_kind='kde', plot_kws={'s': 10})
g.fig.set_size_inches(10, 10)
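fig, ax = plt.subplots(figsize=(12, 10))
# Correlate only the numeric columns; recent pandas versions raise an error if the 'Date' column is included
corr = data.select_dtypes('number').corr()
# Boolean mask that hides the upper triangle (including the diagonal) of the symmetric matrix
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True
sns.heatmap(corr, mask=mask, annot=True, cmap=sns.diverging_palette(240, 10, as_cmap=True), center=0, ax=ax)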
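
The column-name cleanup near the top can be checked in isolation. The following is a minimal sketch, not part of the original gist: the helper name clean_column and the sample header names are hypothetical, chosen only to resemble headers with units in parentheses.

```python
import re

def clean_column(name):
    """Rename the timestamp column; otherwise strip any '(...)' unit suffix."""
    if name == 'Date/Time':
        return 'Date'
    # Remove every parenthesised group, then trim the leftover whitespace
    return re.sub(r'\([^()]*\)', '', name).strip()

# Hypothetical header names (assumed for illustration, not read from weather.csv)
sample = ['Date/Time', 'Mean Temp (°C)', 'Total Precip (mm)', 'Year']
cleaned = [clean_column(c) for c in sample]
print(cleaned)
```

Names without parentheses pass through unchanged, so the same list comprehension is safe to apply to every column.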