@pganssle
Forked from mariocj89/python-datetime.md
Created April 18, 2017 20:32

Intro

Most of us have at some point tried to make things work with the Python datetime module by trial and error. Datetime is one of those APIs that seems easy to use but requires the developer to have a deep understanding of what things actually mean; otherwise it is really easy to introduce unexpected bugs, given the actual complexity of date- and time-related issues.

Time Standards

The first concept we need to grasp when working with time is a standard that defines how we can measure units of time. The same way we have standards to measure weight or length that define kilograms or meters, we need a way to accurately define what a second means. We can then use other time references like days, weeks or years using a calendar standard as multiples of the second.

UT1

One of the "simplest" ways to measure a second is as a fraction of the day, given that we can reliably count on the sun rising and setting every day (in most places). This gave birth to Universal Time (UT1), the successor of GMT. Nowadays we use stars and quasars to measure how long it takes the Earth to perform a full rotation. Even if this seems precise enough, it still has a problem: because of the gravitational pull of the moon, tides and earthquakes, the length of the day varies throughout the year. This is not an issue for most applications, but it becomes a non-trivial problem when we require really precise measurements. GPS triangulation is a good example of a very time-sensitive process, where an offset of one second results in a completely different location on the globe.

TAI

As a result the International Atomic Time (TAI) was designed to be as precise as possible. Using atomic clocks in multiple laboratories across the earth we get the most accurate and constant measure of the second which allows us to compute time intervals with the highest precision.

But this precision is both a blessing and a curse: TAI ticks at a constant rate while the Earth's rotation does not, so TAI gradually deviates from UT1, the solar time that everyday "civil time" is meant to follow. This means that our clock noon would eventually deviate substantially from solar noon.

UTC

That gave birth to Coordinated Universal Time (UTC), which brought together the best of the two worlds. It uses the measurement of a second as defined by TAI which allows for accurate measures of time and introduces leap seconds, ensuring that it never deviates from UT1 by more than 0.9 seconds.

How all this plays together in your computer

With all this information, the reader should now be able to understand how the OS serves the time at any given moment.

Note that the computer has no atomic clock inside but uses an internal clock synchronized with the rest of the world via NTP.

The most common approach in Unix-like systems is POSIX time, which is defined as the number of seconds elapsed since the Unix epoch (1970-01-01T00:00:00 UTC), without taking leap seconds into account. Because neither POSIX time nor Python exposes the leap second, some companies have defined their own way of handling it by smearing the leap second across the surrounding time through their NTP servers. See Google time as an example.
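As a small illustration of the point above, the following sketch uses only the standard library and the leap second that was inserted at the end of 2016-12-31. It is meant to show the behaviour, not to suggest a way of handling leap seconds.

```python
import datetime as dt

# POSIX time treats every day as exactly 86400 seconds long, so the leap
# second inserted at 2016-12-31T23:59:60 UTC is invisible in timestamps.
before = dt.datetime(2016, 12, 31, 23, 59, 59, tzinfo=dt.timezone.utc)
after = dt.datetime(2017, 1, 1, 0, 0, 0, tzinfo=dt.timezone.utc)

elapsed = after.timestamp() - before.timestamp()
print(elapsed)  # 1.0, although 2 SI seconds actually elapsed

# datetime cannot represent the leap second itself: second=60 is rejected.
try:
    dt.datetime(2016, 12, 31, 23, 59, 60, tzinfo=dt.timezone.utc)
    leap_second_accepted = True
except ValueError:
    leap_second_accepted = False

print(leap_second_accepted)  # False
```

If an application really needs to account for leap seconds, a specialised library such as astropy is probably a better fit than datetime.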

Timezones

Credit: WikiMedia

We have seen what UTC is and how it allows us to define dates and times, but countries like their wall-clock noon to match solar noon, so that the sun is at its highest point in the sky at 12pm. That is why we define offsets from UTC: a wall time of 12pm with an offset of +4 hours from UTC means that the time in UTC is 8am. Governments define the standard offset from UTC that a geographical region follows, which effectively creates a timezone. The most common database for timezones is known as the Olson database (also called the IANA tz database). This can be retrieved in Python using dateutil.tz.

>>> from dateutil.tz import gettz
>>> gettz("Europe/Madrid")

The result of gettz gives us an object that we can use to create timezone aware dates in Python.

>>> import datetime as dt
>>> dt.datetime.now().isoformat()
'2017-04-15T14:16:56.551778'  # This is a naive datetime
>>> dt.datetime.now(gettz("Europe/Madrid")).isoformat()
'2017-04-15T14:17:01.256587+02:00'  # This is a tz aware datetime, always prefer these

We can see that the second time we get the current time via the now function of datetime, we pass a tzinfo object and the offset is included in the ISO string representation of that datetime.
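The difference between naive and aware datetimes can be sketched with the standard library alone. The fixed +02:00 offset below is only an illustrative stand-in for a real timezone object such as the one returned by gettz.

```python
import datetime as dt

naive = dt.datetime(2017, 4, 15, 14, 16, 56)
aware = dt.datetime(2017, 4, 15, 14, 16, 56,
                    tzinfo=dt.timezone(dt.timedelta(hours=2)))

print(naive.tzinfo)       # None: no information about where this time applies
print(aware.utcoffset())  # 2:00:00

# Mixing the two kinds in arithmetic raises an error instead of guessing.
try:
    aware - naive
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(mixing_allowed)  # False
```

This is one reason to prefer aware datetimes everywhere: the interpreter can then catch accidental mixing instead of silently assuming a timezone.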

Should we want to use just plain UTC in Python 3 we don't need any external libraries:

>>> dt.datetime.now(dt.timezone.utc).isoformat()
'2017-04-15T12:22:06.637355+00:00'
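To make the offset arithmetic from the start of this section concrete, here is a minimal sketch with a fixed +04:00 offset built from the standard library. The date is arbitrary and the names plus_four and local_noon are only illustrative.

```python
import datetime as dt

# A fixed offset of +4 hours from UTC (not a full timezone with DST rules).
plus_four = dt.timezone(dt.timedelta(hours=4))

# Noon on the wall clock at +04:00 ...
local_noon = dt.datetime(2017, 4, 15, 12, 0, tzinfo=plus_four)

# ... is 08:00 in UTC: the offset is subtracted to get back to UTC.
as_utc = local_noon.astimezone(dt.timezone.utc)

print(local_noon.isoformat())  # 2017-04-15T12:00:00+04:00
print(as_utc.isoformat())      # 2017-04-15T08:00:00+00:00
```

Both objects describe the same instant, so comparing them with == returns True.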

DST

Once we have grasped all this we might feel prepared to work with timezones, but we need to be aware of one more thing that happens only in some timezones: Daylight Saving Time (DST). Countries that follow DST move their clocks one hour forward in spring and one hour back in autumn, returning to the standard time of the timezone.

This effectively implies that a single timezone can have multiple offsets as we can see in the following example:

>>> dt.datetime(2017, 7, 1, tzinfo=dt.timezone.utc).astimezone(gettz("Europe/Madrid")).isoformat()
'2017-07-01T02:00:00+02:00'
>>> dt.datetime(2017, 1, 1, tzinfo=dt.timezone.utc).astimezone(gettz("Europe/Madrid")).isoformat()
'2017-01-01T01:00:00+01:00'

This gives us days that are 23 or 25 hours long, resulting in really interesting time arithmetic. Depending on the time and the timezone, adding a day does not necessarily mean adding 24 hours.

>>> today = dt.datetime(2017, 10, 29, tzinfo=gettz("Europe/Madrid"))
>>> tomorrow = today + dt.timedelta(days=1)
>>> tomorrow.astimezone(dt.timezone.utc) - today.astimezone(dt.timezone.utc)
datetime.timedelta(1, 3600)  # We've added 25 hours

The best strategy here, when working with timestamps, is to use timezones without DST (ideally UTC+00:00).
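One way to apply this, sketched below under the assumption that dateutil is installed and that its timezone data includes Europe/Madrid, is to convert to UTC before adding an elapsed duration and to convert back only for display. The variable names are illustrative.

```python
import datetime as dt
from dateutil.tz import gettz

madrid = gettz("Europe/Madrid")
utc = dt.timezone.utc

# 00:00 on the day DST ends in Madrid (a 25-hour day).
start = dt.datetime(2017, 10, 29, tzinfo=madrid)

# Wall-clock arithmetic: the same wall time on the next calendar day.
next_day = start + dt.timedelta(days=1)

# Elapsed-time arithmetic: go through UTC, add 24 real hours, come back.
plus_24h = (start.astimezone(utc) + dt.timedelta(hours=24)).astimezone(madrid)

print(next_day.isoformat())  # 2017-10-30T00:00:00+01:00
print(plus_24h.isoformat())  # 2017-10-29T23:00:00+01:00
```

Which of the two results is "right" depends on whether the application means "same time tomorrow" or "24 hours from now"; doing the arithmetic in UTC makes the second meaning explicit.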

Wall times

After this the reader might be tempted to just convert all datetime objects to UTC and work only with UTC datetimes and fixed offsets. Even if this is by far the best approach for timestamps, it quickly breaks for future wall times.

We can distinguish two main types of time points: wall times and timestamps. Timestamps are universal points in time not tied to any particular place. Examples include the time a star is born or when a line is logged to a file. But things change when we speak about the time "we read on the wall clock". When we say "see you tomorrow at two" we are not referring to a UTC offset but to tomorrow at 2pm in our local timezone, no matter what the offset is at that point. We cannot simply map future wall times to timestamps (although we can for past ones), because countries might change their offset in the meantime, which, believe it or not, happens more frequently than it seems.

For those situations we need to save the datetime with the timezone it refers to and not the offset.
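A minimal sketch of this idea follows. The zone name "Example/City", the record layout and the resolve helper are all hypothetical, and the two rule dictionaries stand in for two versions of the timezone database; in real code the lookup would go through something like gettz.

```python
import datetime as dt

# Store what the user actually said: a naive wall time plus the zone NAME.
appointment = {"wall_time": "2030-07-01T14:00:00", "zone": "Example/City"}

def resolve(record, rules):
    """Convert a stored wall time to UTC using the rules known right now."""
    naive = dt.datetime.fromisoformat(record["wall_time"])
    aware = naive.replace(tzinfo=rules[record["zone"]])
    return aware.astimezone(dt.timezone.utc)

# Hypothetical rule sets: the government moves the zone from +02:00 to +03:00.
rules_today = {"Example/City": dt.timezone(dt.timedelta(hours=2))}
rules_after_change = {"Example/City": dt.timezone(dt.timedelta(hours=3))}

utc_before = resolve(appointment, rules_today)
utc_after = resolve(appointment, rules_after_change)

print(utc_before.isoformat())  # 2030-07-01T12:00:00+00:00
print(utc_after.isoformat())   # 2030-07-01T11:00:00+00:00
```

The stored record never changes and still means "2pm on the wall clock"; only the UTC instant it maps to moves. Had we stored the precomputed UTC value or the +02:00 offset instead, the appointment would have silently shifted by an hour on the wall clock.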

Quick tips

After all this, how should we avoid the common issues when working with time?

  • Always use timezones; don't rely on the implicit local timezone.
  • Use dateutil/pytz to handle timezones.
  • Always use UTC if working with timestamps.
  • Remember that a day is not always made of 24h for some timezones.
  • Keep your timezone database up to date.
  • Always test your code against situations like DST changes.

Libraries worth mentioning

  • dateutil: Multiple utilities to work with time
  • freezegun: Easier testing of time related applications
  • arrow/pendulum: Drop-in replacements for the standard datetime module
  • astropy: Useful for astronomical times and working with leap seconds