@hotohoto
Created December 15, 2020 06:43
Window a time series dataset with pandas
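import numpy as np
import pandas as pd

input_csv = "./weather_history_tiny.csv"
output_csv = "./weather_history_tiny_windowed.csv"

df = pd.read_csv(input_csv)

# Number of lagged copies per column: T-0 (unshifted) up to T-(n_lags - 1).
# With n_lags = 1 only the unshifted T-0 copy is produced.
n_lags = 1
N = len(df)

# Remember the original columns so they can be dropped once the lags exist.
all_columns = df.columns
input_columns = [col for col in all_columns if col != "Datetime"]

for col in input_columns:
    for i in range(n_lags):
        new_col = f"{col}_T-{i}"
        # Shift the column down by i rows: pad the top with i NaNs and
        # drop the last i values so the length stays N.
        df[new_col] = np.hstack((np.full([i], np.nan), df[col].iloc[:N-i].to_numpy()))

# Keep only the lagged columns (this also drops "Datetime").
df = df.drop(columns=all_columns)
df.to_csv(output_csv, index=False)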
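
The same windowing can probably be written more compactly with pandas' built-in `Series.shift`, which pads the top of a column with NaN in the same way as the manual `np.hstack` call above. The following is a minimal sketch, not the gist author's method: the helper name `window_with_shift`, its `time_col` parameter, and the sample data are all made up for illustration.

```python
import numpy as np
import pandas as pd


def window_with_shift(df, n_lags, time_col="Datetime"):
    """Return lagged copies (T-0 .. T-(n_lags - 1)) of every column except time_col.

    Hypothetical helper, assumed for illustration only.
    """
    value_cols = [col for col in df.columns if col != time_col]
    lagged = {
        f"{col}_T-{i}": df[col].shift(i)  # shift(i) moves values down i rows, filling with NaN
        for col in value_cols
        for i in range(n_lags)
    }
    return pd.DataFrame(lagged, index=df.index)


# Made-up sample data standing in for the weather CSV.
sample = pd.DataFrame({
    "Datetime": ["2020-01-01", "2020-01-02", "2020-01-03"],
    "Temp": [1.0, 2.0, 3.0],
})
windowed = window_with_shift(sample, n_lags=2)
```

Building all lagged columns in a dict and constructing the frame once avoids repeatedly inserting columns into an existing DataFrame, and it leaves the input frame unmodified.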