@pierrelouisbescond
Last active August 21, 2020 14:37
import pandas as pd
import numpy as np
# Import the CSV file with only useful columns
# source: https://www.data.gouv.fr/fr/datasets/temperature-quotidienne-departementale-depuis-janvier-2018/
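# Read the département code as text so leading zeros (e.g. "06") are preserved
df = pd.read_csv("temperature-quotidienne-departementale.csv", sep=";", usecols=[0,1,4], dtype={"Code INSEE département": str})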
# Rename columns to simplify syntax
df = df.rename(columns={"Code INSEE département": "Region", "TMax (°C)": "Temp"})
# Select 2019 records only
df = df[(df["Date"]>="2019-01-01") & (df["Date"]<="2019-12-31")]
# Pivot table to get "Date" as index and regions as columns
df = df.pivot(index='Date', columns='Region', values='Temp')
# Select a set of départements across France (INSEE codes, stored in the "Region" columns)
df = df[["06","25","59","62","83","85","75"]]
# display() is provided by Jupyter/IPython; use print(df) in a plain Python script
display(df)
# Convert the Pandas dataframe to a Numpy array with time-series only
f = df.to_numpy().astype(float)
# Create a normalized time index: evenly spaced floats from 0 to 1, one per row (day) of f
time = np.linspace(0,1,len(f))