Demonstration of a bug in pandas: `Series.map` silently converts a column of `datetime` values into the pandas-specific `datetime64[ns]` dtype, even when the original column has `dtype=object`.
from datetime import datetime, timedelta

import pandas as pd

now = datetime.now()
df = pd.DataFrame.from_dict(
    {
        "created_at": pd.Series(
            [now, now - timedelta(seconds=100), now + timedelta(seconds=10)],
            dtype="object",
        ),
    }
)
# This is True: the column has dtype=object and every value is a plain datetime.
assert df["created_at"].map(lambda x: type(x) is datetime).all()
# Identity map: every value is returned unchanged.
identity = df["created_at"].map(lambda x: x)
# THIS WILL FAIL!
# Even though we are doing _nothing_ to the column's values!
# The check uses the exact type because pd.Timestamp subclasses datetime,
# so isinstance(x, datetime) would still be True after the conversion.
assert identity.map(lambda x: type(x) is datetime).all()
# Pandas has silently changed the dtype of the column from object to datetime64[ns] :(
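One possible way to sidestep the conversion, sketched under assumptions: build the result Series yourself and pass `dtype=object` explicitly, just as the original column was built. The helper name `map_keep_object` is hypothetical and not part of pandas. The dtype that a plain `Series.map` produces may differ between pandas versions, so the snippet only prints it.

```python
from datetime import datetime, timedelta

import pandas as pd


def map_keep_object(series: pd.Series, func) -> pd.Series:
    """Apply func element-wise and keep the result as dtype=object.

    Hypothetical helper for illustration; not a pandas API.
    """
    # An explicit dtype=object means pandas is not asked to infer a dtype
    # from the mapped values.
    return pd.Series([func(x) for x in series], index=series.index, dtype=object)


now = datetime.now()
created_at = pd.Series(
    [now, now - timedelta(seconds=100), now + timedelta(seconds=10)],
    dtype=object,
)

kept = map_keep_object(created_at, lambda x: x)

# Compare the dtype from the helper with the dtype from a plain Series.map.
print("helper:", kept.dtype)
print("Series.map:", created_at.map(lambda x: x).dtype)
```

The trade-off is a Python-level loop instead of `Series.map`, which is likely acceptable for small columns where keeping the original `datetime` objects matters more than speed.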
@malcolmgreaves
Author

Behavior observed in pandas==1.5.3
