Working with DATE, TIMESTAMP, and INTERVAL in PostgreSQL can be confusing. In this article I will go over the three date/time related data types and the two most useful date/time functions, DATE_PART and DATE_TRUNC. Finally, I will provide some real-life examples of how these types and functions can be used within queries.
PostgreSQL Date/Time Documentation
The DATE type contains the year, month, and day of a date. It is not possible to do any time-related functions on a DATE without first converting it to a TIMESTAMP. Subtracting two DATE values from one another results in an INT representing the number of days between them.
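A few quick examples (results shown as comments, formatted the way my client prints them):
'2023-03-15'::date - '2023-01-01'::date -- 73
'2023-01-01'::date + 30 -- '2023-01-31'
'2023-01-01'::date + INTERVAL '5 hours' -- '2023-01-01 05:00:00' (promoted to a TIMESTAMP)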
The TIMESTAMP type contains a year, month, day, hour, minute, second, and microsecond. This is the type that I use most often. A TIMESTAMP can be WITH TIME ZONE or WITHOUT TIME ZONE. The default TIMESTAMP is WITHOUT TIME ZONE, and WITH TIME ZONE can be abbreviated as TIMESTAMPTZ.
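For example, the following casts are equivalent ways of naming the two types:
'2023-02-14 05:06:07'::timestamp -- same as ::timestamp without time zone
'2023-02-14 05:06:07 -05:00'::timestamptz -- same as ::timestamp with time zone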
Internally, TIMESTAMP WITH TIME ZONE values are stored in UTC. For columns or values of the WITH TIME ZONE type, the value returned to you by the server depends only on the TIME ZONE setting of the client session; WITHOUT TIME ZONE values are stored and displayed exactly as written, with no conversion.
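A quick way to see this is to change the session TIME ZONE and select the same literal (the exact output format may vary slightly by client):
SET TIME ZONE 'UTC';
SELECT '2023-02-14 05:06:07 -08:00'::timestamptz; -- 2023-02-14 13:06:07+00
SET TIME ZONE 'America/New_York';
SELECT '2023-02-14 05:06:07 -08:00'::timestamptz; -- 2023-02-14 08:06:07-05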
The INTERVAL type represents the difference between two dates. It can be constructed literally, or by subtracting one TIMESTAMP value from another. An interval can contain years, months, weeks, days, hours, minutes, seconds, and microseconds, and each part can be either positive or negative. However, not all of these units play nicely together: years and months are rolled up together, weeks are rolled into days, and hours, minutes, seconds, and microseconds are rolled up together. These three groups cannot be rolled into each other and remain distinct within the interval.
Here are some examples of how this affects intervals:
INTERVAL '1 year 1 month' -- '1 year 1 mon'
INTERVAL '1 year 14 months' -- '2 years 2 mons'
INTERVAL '1 week' -- '7 days'
INTERVAL '1 week 6 days' -- '13 days'
INTERVAL '52 weeks' -- '364 days'
INTERVAL '366 days' -- '366 days'
INTERVAL '1 hour 1 minute' -- '01:01:00'
INTERVAL '1 hour 61 minutes' -- '02:01:00'
INTERVAL '48 hours 60 minutes' -- '49:00:00'
INTERVAL '1 year 365 days 24 hours' -- '1 year 365 days 24:00:00'
INTERVAL '60 weeks 48 hours 12 minutes' -- '420 days 48:12:00'
INTERVAL values can be added to and subtracted from other INTERVAL values, as well as DATE and TIMESTAMP values. They can also be created by subtracting two TIMESTAMP values.
INTERVAL + INTERVAL = INTERVAL
INTERVAL + DATE = TIMESTAMP
INTERVAL + TIMESTAMP = TIMESTAMP
DATE - DATE = INTEGER
DATE - TIMESTAMP = INTERVAL
TIMESTAMP - TIMESTAMP = INTERVAL
INTERVAL '1 month' + INTERVAL '2 months' -- '3 mons'
INTERVAL '1 year 24 hours' + INTERVAL '1 year 3 days' -- '2 years 3 days 24:00:00'
INTERVAL '1 year 1 day' - INTERVAL '24:00:00' -- '1 year 1 day -24:00:00'
INTERVAL '11 months' + INTERVAL '1 month 1 day' -- '1 year 1 day'
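A few more examples mixing DATE and TIMESTAMP operands:
INTERVAL '1 month' + '2023-01-15'::date -- '2023-02-15 00:00:00'
'2023-03-01'::date - '2023-02-01'::date -- 28
'2023-01-15'::date - '2023-01-14 12:00:00'::timestamp -- '12:00:00'
INTERVAL '2 hours' + '2023-01-15 05:00:00'::timestamp -- '2023-01-15 07:00:00'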
When creating intervals by subtracting two TIMESTAMP values, a year or month value will never be present. The interval will contain the number of days, hours, minutes, seconds, and microseconds between the two TIMESTAMP values, and in this case hours are rolled up into days.
'2023-01-01'::timestamp - '2022-01-01'::timestamp -- '365 days'
'2021-01-01'::timestamp - '2020-01-01'::timestamp -- '366 days'
'2023-01-15'::timestamp - '2023-01-10'::timestamp -- '5 days'
'2023-02-01'::timestamp - '2023-01-01'::timestamp -- '31 days'
'2023-01-15 05:00:00'::timestamp - '2023-01-15 00:00:00'::timestamp -- '05:00:00'
'2023-01-18 05:00:00'::timestamp - '2023-01-15 00:00:00'::timestamp -- '3 days 05:00:00'
INTERVAL values can be multiplied and divided by INT and FLOAT values. This is straightforward when multiplying by an INT.
INTERVAL '1 month' * 2 -- '2 mons'
INTERVAL '1 year 6 months' * 2 -- '3 years'
INTERVAL '5 hours 5 minutes' * 2 -- '10:10:00'
INTERVAL '15 days 24 hours' * 3 -- '45 days 72:00:00'
INTERVAL '1 month 15 days' * 3 -- '3 mons 45 days'
INTERVAL '1 month 15 days' * 12 -- '1 year 180 days'
However, when dividing by an INT or FLOAT value, the results can get distorted because the conversions from years or months to days are inexact (not every year has 365 days, not every month has 30 days, etc.). This will not result in an error, but the results are not always accurate.
INTERVAL '1 month' / 30 -- '1 day'
INTERVAL '1 year' / 365 -- '23:40:16.4064'
INTERVAL '1 year' / 12 -- '1 mon'
INTERVAL '1 month' / 4 -- '7 days 12:00:00'
INTERVAL '1 month' * 0.5 -- '15 days'
INTERVAL '1 min' * 0.5 -- '00:00:30'
INTERVAL '3 months 30 days' / 30 -- '4 days'
INTERVAL values can be compared using the =, <, <=, >, and >= operators. However, you must be careful when doing this for the same reasons as INTERVAL division. Comparisons between INTERVAL values containing year and month parts and INTERVAL values containing day, hour, minute, second, and microsecond parts cannot always be trusted, because the comparison assumes every month has 30 days (and therefore every year has 360 days), which is not always correct. This will not result in an error, but the results are not always accurate.
INTERVAL '1 month' < INTERVAL '2 months' -- TRUE
INTERVAL '1 months 30 days' = INTERVAL '2 months' -- TRUE
INTERVAL '1 year' > INTERVAL '365 days' -- FALSE
INTERVAL '12 months 5 days' = INTERVAL '1 year 5 days' -- TRUE
INTERVAL '52 weeks' > INTERVAL '1 year' -- TRUE
INTERVAL '365 days' > INTERVAL '1 year' -- TRUE
INTERVAL '365 days' = INTERVAL '1 year' -- FALSE
The DATE_TRUNC function is, in my opinion, one of the most useful functions when it comes to writing business intelligence or analytical SQL. It is analogous to the mathematical floor function, except that instead of specifying the number of decimal places, you specify the part of the date. The function has the signature DATE_TRUNC(<date_part>, timestamp/date), where <date_part> is a part of a date (decade, year, quarter, month, week, day, hour, minute, second). It truncates the given timestamp to the beginning of that date_part.
DATE_TRUNC('year', '2023-02-14 05:06:07'::timestamp) -- '2023-01-01 00:00:00'
DATE_TRUNC('quarter', '2023-02-14 05:06:07'::timestamp) -- '2023-01-01 00:00:00'
DATE_TRUNC('month', '2023-02-14 05:06:07'::timestamp) -- '2023-02-01 00:00:00'
DATE_TRUNC('week', '2023-02-14 05:06:07'::timestamp) -- '2023-02-13 00:00:00'
DATE_TRUNC('day', '2023-02-14 05:06:07'::timestamp) -- '2023-02-14 00:00:00'
DATE_TRUNC('hour', '2023-02-14 05:06:07'::timestamp) -- '2023-02-14 05:00:00'
DATE_TRUNC('minute', '2023-02-14 05:06:07'::timestamp) -- '2023-02-14 05:06:00'
DATE_TRUNC('second', '2023-02-14 05:06:07'::timestamp) -- '2023-02-14 05:06:07'
A note about the week date_part: the first day of the week is Monday.
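For example, a Sunday truncates back to the previous Monday:
DATE_TRUNC('week', '2023-02-19 10:00:00'::timestamp) -- '2023-02-13 00:00:00' (2023-02-19 is a Sunday)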
The DATE_TRUNC function also works with intervals. Because this is a floor function, if the given interval is smaller than the date_part used, then a zero interval is returned.
DATE_TRUNC('year', INTERVAL '5 days') -- '00:00:00'
DATE_TRUNC('year', INTERVAL '1 year 2 months 3 days 4 hours 5 minutes 6 seconds') -- '1 year'
DATE_TRUNC('month', INTERVAL '1 year 2 months 3 days 4 hours 5 minutes 6 seconds') -- '1 year 2 mons'
DATE_TRUNC('day', INTERVAL '1 year 2 months 3 days 4 hours 5 minutes 6 seconds') -- '1 year 2 mons 3 days'
DATE_TRUNC('hour', INTERVAL '1 year 2 months 3 days 4 hours 5 minutes 6 seconds') -- '1 year 2 mons 3 days 04:00:00'
DATE_TRUNC('minute', INTERVAL '1 year 2 months 3 days 4 hours 5 minutes 6 seconds') -- '1 year 2 mons 3 days 04:05:00'
DATE_TRUNC('second', INTERVAL '1 year 2 months 3 days 4 hours 5 minutes 6 seconds') -- '1 year 2 mons 3 days 04:05:06'
You cannot use the week date_part when using DATE_TRUNC on an INTERVAL.
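Attempting it raises an error rather than returning a zero interval:
DATE_TRUNC('week', INTERVAL '15 days') -- error: week is not a supported unit for intervals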
The DATE_PART function can also be very useful. PostgreSQL provides two very similar functions, DATE_PART and EXTRACT, with different syntax but essentially identical behavior (DATE_PART returns a double precision value, which can lead to some loss of precision). I prefer to use DATE_PART as it matches the syntax used by DATE_TRUNC.
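The two are interchangeable for the parts covered in this article; only the syntax differs:
DATE_PART('year', '2023-02-14 05:06:07'::timestamp) -- 2023
EXTRACT(YEAR FROM '2023-02-14 05:06:07'::timestamp) -- 2023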
The DATE_PART function extracts a part, or a derivative value, of the date/time from a given DATE, TIMESTAMP, or INTERVAL value. It has the signature DATE_PART(<date_part>, timestamp/date/interval), where <date_part> is either a part of a date/timestamp (year, quarter, month, day, hour, minute, second) or a derivative value of the DATE or TIMESTAMP value.
The date_part derivative values are as follows:
century: The century of the given DATE or TIMESTAMP (we are in the 21st century in 2023)
decade: The year divided by 10
dow: Day of week, from Sunday (0) to Saturday (6)
doy: Day of year, from 1 to 365/366
epoch: The number of seconds since the epoch (1970-01-01 00:00:00), or the total number of seconds in an INTERVAL
isodow: The ISO 8601 day of the week, from Monday (1) to Sunday (7)
isoyear: The ISO 8601 week-numbering year. Very early in the year it may differ from the Gregorian year.
week: The ISO 8601 week number. Week 1 of a given year contains January 4, which means that very early in the year you can still be in week 52 or 53 of the previous year.
timezone: The time zone offset in seconds.
timezone_hour: The hour component of the time zone offset.
timezone_minute: The minute component of the time zone offset.
DATE_PART('year', '2023-02-14 05:06:07'::timestamp) -- 2023
DATE_PART('quarter', '2023-02-14 05:06:07'::timestamp) -- 1
DATE_PART('month', '2023-02-14 05:06:07'::timestamp) -- 2
DATE_PART('day', '2023-02-14 05:06:07'::timestamp) -- 14
DATE_PART('hour', '2023-02-14 05:06:07'::timestamp) -- 5
DATE_PART('minute', '2023-02-14 05:06:07'::timestamp) -- 6
DATE_PART('second', '2023-02-14 05:06:07'::timestamp) -- 7
DATE_PART('century', '2023-02-14 05:06:07'::timestamp) -- 21
DATE_PART('decade', '2023-02-14 05:06:07'::timestamp) -- 202
DATE_PART('dow', '2023-02-14 05:06:07'::timestamp) -- 2
DATE_PART('doy', '2023-01-01 05:06:07'::timestamp) -- 1
DATE_PART('doy', '2023-02-14 05:06:07'::timestamp) -- 45
DATE_PART('epoch', '2023-02-14 05:06:07'::timestamp) -- 1676351167
DATE_PART('epoch', '1970-01-01 00:00:00'::timestamp) -- 0
DATE_PART('isodow', '2023-02-14 05:06:07'::timestamp) -- 2
DATE_PART('week', '2023-02-14 05:06:07'::timestamp) -- 7
DATE_PART('week', '2005-01-01 05:06:07'::timestamp) -- 53
DATE_PART('isoyear', '2023-01-01 05:06:07'::timestamp) -- 2022
DATE_PART('isoyear', '2023-02-14 05:06:07'::timestamp) -- 2023
-- The timezone values change according to your client settings, not the value given.
-- My client is connected at -5:00 so my values will reflect this.
DATE_PART('timezone', '2023-02-14 05:06:07 -8:00'::timestamp with time zone) -- -18000
DATE_PART('timezone_hour', '2023-02-14 05:06:07 -8:00'::timestamp with time zone) -- -5
DATE_PART('timezone_minute', '2023-02-14 05:06:07 -8:00'::timestamp with time zone) -- 0
You must be careful when using DATE_PART with INTERVAL values. INTERVAL values store their information in distinct parts (year/month, week/day, and hour/minute/second) and perform rollups only within those parts, so extracting a single part from an INTERVAL might not tell the full story of that INTERVAL.
Certain date_parts are not available for intervals: dow, doy, isodow, isoyear, week, timezone, timezone_hour, and timezone_minute.
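Requesting one of these parts from an INTERVAL raises an error:
DATE_PART('dow', INTERVAL '3 days') -- error: dow is only defined for DATE/TIMESTAMP values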
DATE_PART('year', INTERVAL '2 years 1 month') -- 2
DATE_PART('year', INTERVAL '12 months') -- 1
DATE_PART('year', INTERVAL '1 year 13 months') -- 2
DATE_PART('year', INTERVAL '1 year 366 days') -- 1
DATE_PART('month', INTERVAL '11 months') -- 11
DATE_PART('month', INTERVAL '1 year 13 months') -- 1
DATE_PART('month', INTERVAL '12 months') -- 0
DATE_PART('day', INTERVAL '31 days 4 hours') -- 31
DATE_PART('day', INTERVAL '2 weeks') -- 14
DATE_PART('day', INTERVAL '2 weeks 8 days') -- 22
DATE_PART('day', INTERVAL '8 days 24 hours') -- 8
DATE_PART('hour', INTERVAL '31 days 4 hours') -- 4
DATE_PART('hour', INTERVAL '4 hours 5 minutes') -- 4
DATE_PART('hour', INTERVAL '4 hours 61 minutes') -- 5
DATE_PART('minutes', INTERVAL '40 minutes') -- 40
DATE_PART('minutes', INTERVAL '65 minutes') -- 5
DATE_PART('minutes', INTERVAL '40 minutes 70 seconds') -- 41
DATE_PART('epoch', INTERVAL '1 minute') -- 60
DATE_PART('epoch', INTERVAL '1 minute 20 seconds') -- 80
DATE_PART('epoch', INTERVAL '1 day 1 hour 3 minutes') -- 90180
-- be careful when using epoch with intervals that contain months or years, as the conversion uses estimates (1 month = 30 days, 1 year = 365.25 days).
DATE_PART('epoch', INTERVAL '1 month') -- 2592000
DATE_PART('epoch', INTERVAL '1 year') -- 31557600
The easiest way to query and aggregate data in a time-series format is with the DATE_TRUNC function. This way, data spanning multiple years can be grouped into a single continuous series.
SELECT DATE_TRUNC('month', o.timestamp_col) AS date_month
, SUM(o.total_price) AS revenue
FROM orders o
GROUP BY 1
ORDER BY 1 ASC;
I will leverage DATE_TRUNC to first aggregate the data, and then utilize an INTERVAL in my join to align consecutive rows. This allows data from multiple years to be queried together.
WITH monthly_revenue AS (
SELECT DATE_TRUNC('month', o.timestamp_col) AS date_month
, SUM(o.total_price) AS revenue
FROM orders o
GROUP BY 1
)
SELECT current.date_month
, current.revenue AS revenue
, previous.revenue AS prev_revenue
, (current.revenue - previous.revenue) / NULLIF(previous.revenue, 0) AS pct_change
FROM monthly_revenue current
LEFT JOIN monthly_revenue previous ON
current.date_month - INTERVAL '1 month' = previous.date_month
ORDER BY 1 ASC;
When looking at data aggregated by week, it is very difficult to do a year-over-year comparison, because the same calendar date falls in a different week each year. To do this, we can leverage the ISO 8601 week-of-year convention, where each week in a year is given a number from 1 to 53. We can extract the week and year using DATE_PART to align the rows.
WITH weekly_revenue AS (
SELECT DATE_PART('week', o.timestamp_col) AS week_of_year
, DATE_PART('year', o.timestamp_col) AS year
, SUM(o.total_price) AS revenue
FROM orders o
GROUP BY 1, 2
)
SELECT current.week_of_year
, current.year
, current.revenue AS revenue
, previous.revenue AS prev_revenue
, (current.revenue - previous.revenue) / NULLIF(previous.revenue, 0) AS pct_change
FROM weekly_revenue current
LEFT JOIN weekly_revenue previous ON
current.week_of_year = previous.week_of_year
AND current.year - 1 = previous.year
ORDER BY 1 ASC;
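One caveat with the query above: as noted in the isoyear discussion, dates at the very start of January can belong to the last ISO week of the previous year, so grouping on DATE_PART('year', ...) can split those weeks incorrectly. A stricter variant (a sketch against the same hypothetical orders table) would group on the ISO week-numbering year instead:
SELECT DATE_PART('week', o.timestamp_col) AS week_of_year
, DATE_PART('isoyear', o.timestamp_col) AS year
, SUM(o.total_price) AS revenue
FROM orders o
GROUP BY 1, 2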
Some companies have quarters that are offset from the conventional Q1 starting on January 1. For instance, Q1 2024 might start on 2023-02-01. We can use a combination of INTERVAL and DATE_PART to accommodate this.
SELECT DATE_PART('year', o.timestamp_col + INTERVAL '1 year -1 month') AS year
, DATE_PART('quarter', o.timestamp_col + INTERVAL '1 year -1 month') AS quarter
, SUM(o.total_price) AS revenue
FROM orders o
GROUP BY 1, 2
ORDER BY 2 ASC, 1 ASC;
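To see how the INTERVAL '1 year -1 month' shift behaves, here is how a couple of boundary dates map under this hypothetical February-start fiscal calendar:
DATE_PART('quarter', '2023-02-01'::timestamp + INTERVAL '1 year -1 month') -- 1
DATE_PART('year', '2023-02-01'::timestamp + INTERVAL '1 year -1 month') -- 2024
DATE_PART('quarter', '2023-01-31'::timestamp + INTERVAL '1 year -1 month') -- 4
DATE_PART('year', '2023-01-31'::timestamp + INTERVAL '1 year -1 month') -- 2023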
Given a table of timestamped page views, we can leverage TIMESTAMP differences and DATE_PART on the resulting INTERVAL values to build sessions from these page views. We will define the end of a session as a gap of more than 20 minutes between page views.
WITH lagged_lead_views AS (
SELECT v.id
, v.timestamp_col
, v.user_id
, LAG(v.timestamp_col)
OVER (
PARTITION BY v.user_id
ORDER BY v.timestamp_col ASC
) AS previous_view_timestamp
, LEAD(v.timestamp_col)
OVER (
PARTITION BY v.user_id
ORDER BY v.timestamp_col ASC
) AS next_view_timestamp
FROM views v
), session_start_views AS (
SELECT lv.id
, lv.timestamp_col AS session_start_timestamp
, lv.user_id
, ROW_NUMBER()
OVER (
PARTITION BY lv.user_id
ORDER BY lv.timestamp_col ASC
) AS user_session_id
FROM lagged_lead_views lv
WHERE lv.previous_view_timestamp IS NULL
OR DATE_PART('epoch', lv.timestamp_col - lv.previous_view_timestamp) > (20 * 60) -- 20 minutes in seconds
), session_end_views AS (
SELECT lv.id
, lv.timestamp_col AS session_end_timestamp
, lv.user_id
, ROW_NUMBER()
OVER (
PARTITION BY lv.user_id
ORDER BY lv.timestamp_col ASC
) AS user_session_id
FROM lagged_lead_views lv
WHERE lv.next_view_timestamp IS NULL
OR DATE_PART('epoch', lv.next_view_timestamp - lv.timestamp_col) > (20 * 60) -- 20 minutes in seconds
)
SELECT ssv.user_id
, ssv.session_start_timestamp
, sev.session_end_timestamp
FROM session_start_views ssv
INNER JOIN session_end_views sev ON
ssv.user_id = sev.user_id
AND ssv.user_session_id = sev.user_session_id
ORDER BY 2 ASC;
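As a small extension, the length of each session is just a TIMESTAMP subtraction, which produces an INTERVAL. A sketch of the final SELECT with an added duration column:
SELECT ssv.user_id
, ssv.session_start_timestamp
, sev.session_end_timestamp
, sev.session_end_timestamp - ssv.session_start_timestamp AS session_duration
FROM session_start_views ssv
INNER JOIN session_end_views sev ON
ssv.user_id = sev.user_id
AND ssv.user_session_id = sev.user_session_id
ORDER BY 2 ASC;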