Skip to content

Instantly share code, notes, and snippets.

@sycobuny
Last active July 5, 2016 15:06
Show Gist options
  • Save sycobuny/5918e10567a7aa4134a39e62987d6265 to your computer and use it in GitHub Desktop.
Save sycobuny/5918e10567a7aa4134a39e62987d6265 to your computer and use it in GitHub Desktop.
Calculating a series based on an initial timestamp and a potentially multiple-day-spanning sequence of ordered times

This query is meant to serve as a proof-of-concept for parsing sleep data off of the fitbit API, which returns you a start date, and then a series of times. Given that these data are polled every minute, you can safely assume that, when you cross a date boundary, the next measurement of time will be "smaller" than the last, and we could just break it into two days.

However, in an extreme edge case where either the fitbit is reporting wrong, or someone really is sleeping for more than 24 hours, we need to really break these data down a bit more consistently. This query will do that, and then assign a proper date to the results so that it's a full TIMESTAMP instead of a simple TIME reference.

The rd value is used to calculate what the result should be, for comparison.

Running this should result in a dataset similar to the following:

postgres@[local]:5432/fitbit-dev =# \i example.sql
┌─────┬─────────────────────┬─────────────────────┐
│ idx │    intended_date    │   calculated_date   │
├─────┼─────────────────────┼─────────────────────┤
│   1 │ 2016-06-21 11:52:00 │ 2016-06-21 11:52:00 │
│   2 │ 2016-06-22 00:52:00 │ 2016-06-22 00:52:00 │
│   3 │ 2016-06-22 10:52:00 │ 2016-06-22 10:52:00 │
│   4 │ 2016-06-22 23:52:00 │ 2016-06-22 23:52:00 │
│   5 │ 2016-06-23 00:52:00 │ 2016-06-23 00:52:00 │
└─────┴─────────────────────┴─────────────────────┘
(5 rows)
-- create a simple table for testing
CREATE TEMPORARY TABLE example (
idx INTEGER PRIMARY KEY,
ts TIME NOT NULL,
rd DATE -- "real" date for later comparison
);
-- populate arbitrary times and intended dates
-- (note, in the real world these timestamps would be more predictable)
-- (and closer together)
INSERT INTO example (ts, idx, rd)
VALUES
('11:52:00', 1, '2016-06-21'),
('00:52:00', 2, '2016-06-22'),
('10:52:00', 3, '2016-06-22'),
('11:52:00', 4, '2016-06-22'),
('12:52:00', 5, '2016-06-23');
/**
* This could be done much more cleanly and/or with an eye towards
* efficiency, but this is merely a POC
*/
WITH
st(d) AS (VALUES ('2016-06-21'::date)),
ds AS (
SELECT
*,
(
lag(ts, 1, NULL) OVER a IS NOT NULL AND
lag(ts, 1, NULL) OVER a > ts
)::integer AS new_date
FROM example
WINDOW a AS (ORDER BY idx)
),
rs AS (
SELECT
idx,
rd::timestamp + ts::interval AS intended_date,
(
(SELECT d FROM st LIMIT 1) +
((SUM(new_date) OVER p)::integer::text || ' days')::interval
) + ts::interval AS calculated_date
FROM ds
WINDOW p AS (ORDER BY idx)
)
SELECT *
FROM rs;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment