@AbdouSeck
Created October 16, 2018 21:27
# The arbitrary and opinionated conversion of strings to datetime in Series comparisons
# When the left-hand side of a comparison is a pandas datetime object,
# pandas casts the left-hand side to pd.DatetimeIndex.
# Once the cast succeeds, pandas calls the __eq__ method on the result of the cast.
# So pd.DatetimeIndex.__eq__ is the place to look to see how the comparison is handled.
# The following is how pd.DatetimeIndex.__eq__ is defined (it's a decorated function, so the function name doesn't show up;
# opname and nat_result are free variables taken from the enclosing scope):
def wrapper(self, other):
    func = getattr(super(DatetimeIndex, self), opname)

    if isinstance(other, (datetime, compat.string_types)):
        if isinstance(other, datetime):
            # GH#18435 strings get a pass from tzawareness compat
            self._assert_tzawareness_compat(other)

        other = _to_m8(other, tz=self.tz)
        result = func(other)
        if isna(other):
            result.fill(nat_result)
    else:
        if isinstance(other, list):
            other = DatetimeIndex(other)
        elif not isinstance(other, (np.datetime64, np.ndarray, Index, ABCSeries)):
            # Following Timestamp convention, __eq__ is all-False
            # and __ne__ is all True, others raise TypeError.
            if opname == '__eq__':
                return np.zeros(shape=self.shape, dtype=bool)
            elif opname == '__ne__':
                return np.ones(shape=self.shape, dtype=bool)
            raise TypeError('%s type object %s' % (type(other), str(other)))

        if is_datetimelike(other):
            self._assert_tzawareness_compat(other)

        result = func(np.asarray(other))
        result = com._values_from_object(result)

        # Make sure to pass an array to result[...]; indexing with
        # Series breaks with older version of numpy
        o_mask = np.array(isna(other))
        if o_mask.any():
            result[o_mask] = nat_result

    if self.hasnans:
        result[self._isnan] = nat_result

    # support of bool dtype indexers
    if is_bool_dtype(result):
        return result
    return Index(result)
# As you can see, the method checks whether the right-hand side value is a datetime or some form of string.
# If it is, the value is converted to an M8 type (NumPy's datetime64 dtype) by the call to _to_m8.
# That is where the conversion occurs.
# Why this is not documented is beyond me, since it is quite an opinionated thing to do.
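# A minimal sketch of the behaviour described above, assuming a reasonably recent pandas install.
# The variable names (dates, string_result, timestamp_result) are illustrative and not part of pandas.
# It only shows the effect you can observe from outside; it does not call the internal wrapper shown earlier.

```python
import pandas as pd

# A datetime64 Series on the left-hand side of the comparison.
dates = pd.Series(pd.to_datetime(["2018-10-16", "2018-10-17"]))

# The plain string on the right-hand side is parsed as a timestamp
# before the element-wise comparison takes place.
string_result = dates == "2018-10-16"

# Comparing against an explicit Timestamp gives the same answer,
# which is what you would expect if the string is converted first.
timestamp_result = dates == pd.Timestamp("2018-10-16")

print(string_result.tolist())     # [True, False]
print(timestamp_result.tolist())  # [True, False]
```

# If you would rather not depend on the implicit parsing, build the pd.Timestamp yourself as in the second comparison.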