@alep
Created February 13, 2015 19:42
Strange results when pivoting: it does not group by day as expected.
In [1]: %load test.py
In [2]: import pandas as pd
import numpy as np
frame = pd.read_csv("table.csv", engine="python", parse_dates=['since'])
print frame
d = pd.pivot_table(frame, index=pd.TimeGrouper(key='since', freq='1d'), values=["value"], columns=['id'], aggfunc=np.sum, fill_value=0)
print d
print "^that is not what I expected"
frame = pd.read_csv("table2.csv", engine="python", parse_dates=['since']) # add some values to a day
print frame
d = pd.pivot_table(frame, index=pd.TimeGrouper(key='since', freq='1d'), values=["value"], columns=['id'], aggfunc=np.sum, fill_value=0)
print d
...:
/home/aperalta/.virtualenvs/xapo/local/lib/python2.7/site-packages/pandas/io/excel.py:626: UserWarning: Installed openpyxl is not supported at this time. Use >=1.6.1 and <2.0.0.
.format(openpyxl_compat.start_ver, openpyxl_compat.stop_ver))
id since value
0 81 2015-01-31 07:00:00 2200
1 81 2015-02-01 07:00:00 2200
id value
<pandas.tseries.resample.TimeGrouper object at 0x7fc595f96c10> 81 2200
id 81 2200
^that is not what I expected
id since value
0 81 2015-01-31 07:00:00 2200
1 81 2015-01-31 08:00:00 2200
2 81 2015-01-31 09:00:00 2200
3 81 2015-02-01 07:00:00 2200
value
id 81
since
2015-01-31 6600
2015-02-01 2200
id since value
81 2015-01-31 07:00:00+00:00 2200.0000
81 2015-02-01 07:00:00+00:00 2200.0000
id since value
81 2015-01-31 07:00:00+00:00 2200.0000
81 2015-01-31 08:00:00+00:00 2200.0000
81 2015-01-31 09:00:00+00:00 2200.0000
81 2015-02-01 07:00:00+00:00 2200.0000
# coding: utf-8
import pandas as pd
import numpy as np
frame = pd.read_csv("table.csv", engine="python", parse_dates=['since'])
print frame
d = pd.pivot_table(frame, index=pd.TimeGrouper(key='since', freq='1d'), values=["value"], columns=['id'], aggfunc=np.sum, fill_value=0)
print d
print "^that is not what I expected"
frame = pd.read_csv("table2.csv", engine="python", parse_dates=['since']) # add some values to a day
print frame
d = pd.pivot_table(frame, index=pd.TimeGrouper(key='since', freq='1d'), values=["value"], columns=['id'], aggfunc=np.sum, fill_value=0)
print d
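For comparison, here is a minimal sketch of the same daily pivot on a current pandas under Python 3, where `pd.TimeGrouper` no longer exists and `pd.Grouper` takes its place. The helper name `daily_pivot` and the inline frames are assumptions for illustration: the frames stand in for `table.csv` and `table2.csv` and copy the rows printed in the session, with `since` as naive timestamps. It is not a diagnosis of the behaviour shown in the session.

```python
import pandas as pd


def daily_pivot(frame):
    """Sum `value` per calendar day (rows) and per `id` (columns)."""
    return pd.pivot_table(
        frame,
        index=pd.Grouper(key="since", freq="D"),  # replaces pd.TimeGrouper
        columns="id",
        values="value",
        aggfunc="sum",
        fill_value=0,
    )


# Hypothetical stand-in for table.csv: one row per day.
frame1 = pd.DataFrame({
    "id": [81, 81],
    "since": pd.to_datetime(["2015-01-31 07:00:00", "2015-02-01 07:00:00"]),
    "value": [2200, 2200],
})

# Hypothetical stand-in for table2.csv: three rows on the first day.
frame2 = pd.DataFrame({
    "id": [81, 81, 81, 81],
    "since": pd.to_datetime([
        "2015-01-31 07:00:00",
        "2015-01-31 08:00:00",
        "2015-01-31 09:00:00",
        "2015-02-01 07:00:00",
    ]),
    "value": [2200, 2200, 2200, 2200],
})

daily1 = daily_pivot(frame1)
daily2 = daily_pivot(frame2)
print(daily1)
print(daily2)
```

With this sketch the two-row frame should also come back as one row per day, which is the result the original script was expecting from `table.csv`.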