@dnaranjo89
Forked from mariocj89/python-datetime.md
Last active April 16, 2017 09:56

Intro

Most of us have, at some point, tried to make Python's datetime module work by trial and error. datetime is one of those APIs that seems easy to use but requires a solid understanding of what things actually mean. Otherwise, given the real complexity of date and time issues, it is very easy to introduce unexpected bugs.

Time Standards

The first concept we need to grasp when working with time is that we need a standard that defines how we are going to measure it. In the same way that we have standards to measure weight or length based on kilograms or meters, we need a way to accurately define what a second is, so that we can then build other time references like days, weeks or years using a calendar standard.

UT1

One of the "simplest" ways to measure a second is as a fraction of the day, given that we can rely on the sun rising and setting every day (in most places). This gave birth to Universal Time (UT1), the successor of GMT. Nowadays we use stars and quasars to measure how long it takes the Earth to perform a full rotation. Even if this seems precise enough, it still brings some issues along: due to the gravitational pull of the moon, tides and earthquakes, the length of the day changes throughout the year. This is not an issue for most applications, but it becomes a non-trivial problem when we require really precise measurements. GPS triangulation is a good example of a very time-sensitive process, where a one-second offset results in a completely different location on the globe.

TAI

As a result, International Atomic Time (TAI) was designed to be as precise as possible. Using atomic clocks in multiple laboratories across the Earth, we get the most accurate and constant measure of the second, which allows us to compute time intervals with the highest precision.

But this precision is both a blessing and a curse as TAI is so precise that it deviates from UT1 or what we call "Civil Time". This means that we will eventually have our clock noon deviate substantially from the "Solar noon".

UTC

That gave birth to UTC, which brings together the best of both worlds. It uses the TAI second, which allows for accurate measurement of time, and introduces leap seconds to ensure that it never deviates from UT1 by more than 0.9 seconds.

How all this plays together in your computer

With all this information, the reader should now be able to understand how the OS serves the time at a specific moment.

Note that the computer has no atomic clock inside but uses an internal clock synchronized with the rest of the world via NTP.

The most common way in UNIX systems is to use POSIX time, which is defined as the number of seconds since the Unix epoch (1970-01-01 00:00:00 UTC) without taking leap seconds into account. As neither POSIX time nor Python exposes the leap second, some companies have defined their own way of handling it by smearing the leap second across the time around it through their NTP servers. See Google's leap smear as an example.
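
As a small illustration of what "without taking leap seconds into account" means, the sketch below looks at the leap second inserted at the end of 2016. It only uses the standard library. The variable names are made up for this example.

```python
import datetime as dt

utc = dt.timezone.utc

# POSIX time 0 is the Unix epoch.
epoch = dt.datetime.fromtimestamp(0, tz=utc)
print(epoch.isoformat())  # 1970-01-01T00:00:00+00:00

# A leap second (23:59:60) was inserted between these two instants,
# so two real seconds elapsed, but POSIX time only counts one.
before = dt.datetime(2016, 12, 31, 23, 59, 59, tzinfo=utc)
after = dt.datetime(2017, 1, 1, 0, 0, 0, tzinfo=utc)
posix_gap = after.timestamp() - before.timestamp()
print(posix_gap)  # 1.0

# datetime cannot represent the leap second itself: seconds must be 0..59.
try:
    dt.datetime(2016, 12, 31, 23, 59, 60, tzinfo=utc)
    leap_second_representable = True
except ValueError:
    leap_second_representable = False
print(leap_second_representable)  # False
```

For most applications this one-second discrepancy does not matter, but it is worth knowing that intervals computed from POSIX timestamps silently skip leap seconds.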

Timezones

Credit: WikiMedia

We have seen what UTC is and how it allows us to define dates and times, but countries like their wall-clock noon to match solar noon, so that the sun is at its highest point in the sky at 12pm. That is why UTC defines offsets: we can have 12pm with an offset of +4h from UTC, which effectively means that the time in UTC is 8am. Governments define the standard offset from UTC that a geographical region follows, which effectively creates a timezone. The most common database of timezones is known as the Olson database (also called the IANA tz database). It can be accessed in Python using dateutil.tz.

>>> from dateutil.tz import gettz
>>> gettz("Europe/Madrid")

The result of gettz gives us an object that we can use to create timezone aware dates in python.

>>> import datetime as dt
>>> dt.datetime.now().isoformat()
'2017-04-15T14:16:56.551778'  # This is a naive datetime
>>> dt.datetime.now(gettz("Europe/Madrid")).isoformat()
'2017-04-15T14:17:01.256587+02:00'  # This is a tz aware datetime, always prefer these

We can see that the second time we get the current time via the now function of datetime, we pass a tzinfo object, and the offset is included in the ISO string representation of that datetime.

Should we want to use just plain UTC in Python3 we don't need any external library:

>>> dt.datetime.now(dt.timezone.utc).isoformat()
'2017-04-15T12:22:06.637355+00:00'
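
To make the idea of an offset concrete, here is a minimal sketch using only the standard library. It builds the "12pm at +4h" case as a fixed-offset datetime, converts it to UTC, and shows what happens when naive and aware datetimes are mixed. The names `plus_four`, `local_noon` and `mixing_error` are invented for this example.

```python
import datetime as dt

# A fixed offset of +4h from UTC (not a full timezone: no DST rules).
plus_four = dt.timezone(dt.timedelta(hours=4))

local_noon = dt.datetime(2017, 4, 15, 12, 0, tzinfo=plus_four)
as_utc = local_noon.astimezone(dt.timezone.utc)
print(local_noon.isoformat())  # 2017-04-15T12:00:00+04:00
print(as_utc.isoformat())      # 2017-04-15T08:00:00+00:00

# Both objects describe the same instant, so they compare equal.
print(local_noon == as_utc)    # True

# Ordering a naive datetime against an aware one is an error.
naive = dt.datetime(2017, 4, 15, 12, 0)
mixing_error = None
try:
    naive < local_noon
except TypeError as exc:
    mixing_error = exc
print(type(mixing_error).__name__)  # TypeError
```

The TypeError is one more reason to prefer aware datetimes everywhere: Python refuses to guess which instant a naive value refers to.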

DST

Once we grasp all this, we might feel prepared to work with timezones, but we need to be aware of one more thing that happens only in some timezones: Daylight Saving Time (DST). Countries that follow DST move their clocks one hour forward in spring and one hour back in autumn, returning to the standard time of the timezone.

This effectively implies that a single timezone can have multiple offsets as we can see in the following example:

>>> dt.datetime(2017, 7, 1, tzinfo=dt.timezone.utc).astimezone(gettz("Europe/Madrid")).isoformat()
'2017-07-01T02:00:00+02:00'
>>> dt.datetime(2017, 1, 1, tzinfo=dt.timezone.utc).astimezone(gettz("Europe/Madrid")).isoformat()
'2017-01-01T01:00:00+01:00'

This gives us days that are made of 23 or 25 hours, resulting in really interesting time arithmetic. Depending on the time and the timezone, adding a day does not necessarily mean adding 24 hours.

>>> today = dt.datetime(2017, 10, 29, tzinfo=gettz("Europe/Madrid"))
>>> tomorrow = today + dt.timedelta(days=1)
>>> tomorrow.astimezone(dt.timezone.utc) - today.astimezone(dt.timezone.utc)
datetime.timedelta(1, 3600)  # We've added 25 hours  

The best strategy here, when working with timestamps, is to use a timezone that has no DST, ideally UTC.
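
The difference between "the same wall-clock time on the next day" and "24 hours later" can be sketched as follows. This is only an illustration under a few assumptions: it uses the standard library zoneinfo module (Python 3.9+, with a timezone database available on the system) instead of dateutil, and the helper names `add_wall_days`, `add_elapsed` and `elapsed_between` are hypothetical.

```python
import datetime as dt
from zoneinfo import ZoneInfo  # standard library since Python 3.9

UTC = dt.timezone.utc
madrid = ZoneInfo("Europe/Madrid")


def add_wall_days(moment, days):
    """Move forward by whole calendar days, keeping the local clock time."""
    return moment + dt.timedelta(days=days)


def add_elapsed(moment, delta):
    """Move forward by real elapsed time, doing the arithmetic in UTC."""
    return (moment.astimezone(UTC) + delta).astimezone(moment.tzinfo)


def elapsed_between(start, end):
    """Real elapsed time between two aware datetimes, measured in UTC."""
    return end.astimezone(UTC) - start.astimezone(UTC)


# DST ends in Madrid on 2017-10-29, so this day lasts 25 hours.
start = dt.datetime(2017, 10, 29, tzinfo=madrid)
wall_next = add_wall_days(start, 1)
real_next = add_elapsed(start, dt.timedelta(hours=24))

print(wall_next.isoformat())              # 2017-10-30T00:00:00+01:00
print(real_next.isoformat())              # 2017-10-29T23:00:00+01:00
print(elapsed_between(start, wall_next))  # 1 day, 1:00:00

# Subtracting two datetimes that share the same tzinfo ignores the
# offsets, so the extra hour is invisible here.
print(wall_next - start)                  # 1 day, 0:00:00
```

Which of the two helpers is right depends on the intent: a daily alarm wants the wall-clock version, while a 24-hour timeout wants the elapsed version.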

Wall times

After this the reader might be tempted to just convert all datetime objects to UTC and work only with UTC datetimes and fixed offsets. Even if this is by far the best approach for timestamps, it quickly breaks for future wall times.

We can distinguish two main types of time points: wall times and timestamps. Timestamps are universal points in time, not related to anywhere in particular. Examples include the time a star is born or when a line is logged to a file. But things change when we speak about the time "we read on the wall clock". When we say "see you tomorrow at two", we are not referring to UTC offsets but to tomorrow at 2pm in our local timezone, no matter what the offset is at that point. We cannot simply map those wall times to timestamps. We can for past ones, but for future occurrences countries might change their offset, which, believe it or not, happens more frequently than it seems.

For those situations we need to save the datetime with the timezone it refers to and not the offset.
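
One possible way to store such a value is to keep the naive local time together with the timezone name, and only resolve it to UTC when it is needed. The sketch below assumes Python 3.9+ with the standard library zoneinfo module and a timezone database available on the system. The `WallTime` class and its field names are hypothetical, not part of any library.

```python
import datetime as dt
from dataclasses import dataclass
from zoneinfo import ZoneInfo  # standard library since Python 3.9


@dataclass(frozen=True)
class WallTime:
    """A wall-clock time: a naive local datetime plus a timezone name."""

    local: dt.datetime  # naive, as read on the wall clock
    zone: str           # timezone name, e.g. "Europe/Madrid"

    def to_utc(self):
        """Resolve to UTC using the timezone rules known right now."""
        aware = self.local.replace(tzinfo=ZoneInfo(self.zone))
        return aware.astimezone(dt.timezone.utc)


summer_meeting = WallTime(dt.datetime(2017, 7, 1, 14, 0), "Europe/Madrid")
winter_meeting = WallTime(dt.datetime(2017, 1, 1, 14, 0), "Europe/Madrid")

print(summer_meeting.to_utc().isoformat())  # 2017-07-01T12:00:00+00:00
print(winter_meeting.to_utc().isoformat())  # 2017-01-01T13:00:00+00:00
```

Because the offset is looked up at resolution time rather than stored, an updated timezone database changes the resulting UTC instant while the stored wall time stays "2pm in Madrid".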

Quick tips

After all this, how should we avoid the common issues when working with time?

  • Always use timezones, don't rely on local time.
  • Use dateutil/pytz to handle timezones.
  • If working with timestamps just use UTC.
  • Think twice about what you actually want when doing time arithmetic.
  • Be aware of the multiple tweaks when calculating intervals (DST, leap seconds, calendar changes, etc.)
  • Test out with dates that contain DST changes.