@slchangtw
Last active August 13, 2018 10:29
Filling zeros in time series data
import numpy as np
import pandas as pd
from datetime import datetime
# build an example: random order counts for items A and B in scattered months of 2016-2017
np.random.seed(0)
item = np.random.choice(['A', 'B'], 10)
year = np.random.choice([2016, 2017], 10)
month = np.random.choice(range(1, 13), 10, replace=False)
order = np.random.randint(low=1, high=10, size=10)
df = pd.DataFrame({'item': item,
                   'year': year,
                   'month': month,
                   'order': order})
# build a first-of-month date per row, then index by (item, year_month)
df['year_month'] = df.apply(lambda row: datetime(row['year'], row['month'], 1), axis=1)
df = df.set_index(['item', 'year_month'])
df = df.drop(['year', 'month'], axis=1)
# create the full index: every item crossed with every month of 2016-2017
item_index = df.index.levels[0]
date_index = pd.date_range('2016/1/1', periods=24, freq='MS')
iterable = [item_index, date_index]
new_index = pd.MultiIndex.from_product(iterable, names=['item', 'date'])
# reindex onto the full index; (item, month) pairs missing from the data get 0
df = df.reindex(new_index, fill_value=0)
# unstack: move items into columns, leaving one row per month
new_df = df.unstack('item', fill_value=0)
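
The same reindex-and-fill idea can be wrapped in a reusable function that derives the month range from the data instead of hard-coding `'2016/1/1'` and `periods=24`. This is only a sketch under stated assumptions: the function name `fill_missing_months` and its parameters are hypothetical and not part of the gist. It also assumes each (item, month) pair appears at most once, since `reindex` raises an error on a duplicated index.

```python
import pandas as pd


def fill_missing_months(df, item_col="item", date_col="year_month", value_col="order"):
    """Hypothetical helper: return one row per (item, month), filling gaps with 0.

    Assumes `date_col` holds first-of-month timestamps and that each
    (item, month) pair occurs at most once in `df`.
    """
    items = df[item_col].unique()
    # Month-start range spanning the earliest to the latest month in the data.
    months = pd.date_range(df[date_col].min(), df[date_col].max(), freq="MS")
    full_index = pd.MultiIndex.from_product([items, months],
                                            names=[item_col, date_col])
    return (
        df.set_index([item_col, date_col])[[value_col]]
          .reindex(full_index, fill_value=0)
    )


# Small example: A has orders in January and April, B only in February.
example = pd.DataFrame({
    "item": ["A", "A", "B"],
    "year_month": pd.to_datetime(["2016-01-01", "2016-04-01", "2016-02-01"]),
    "order": [5, 2, 7],
})
filled = fill_missing_months(example)
```

Here `filled` has a row for every item in every month from January to April 2016, and calling `filled.unstack("item")` gives the same wide layout as `new_df` above.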
@dlainfiesta

Very nice! 👍