This is a description of a scheme I've used a few times for representing time series data on a fairly small scale. In a nutshell, it uses a single table to house multiple data series on multiple scales and is designed to make it easy to display those series.
series | timescale | timestamp | value |
---|---|---|---|
active-users | weekly | 2016-03-05 | 487 |
active-users | daily | 2016-03-05 | 75 |
active-users | daily | 2016-03-06 | 43 |
active-users | daily | 2016-03-07 | 51 |
active-users | daily | 2016-03-08 | 64 |
active-users | daily | 2016-03-09 | 77 |
active-users | daily | 2016-03-10 | 103 |
active-users | daily | 2016-03-11 | 74 |
... | ... | ... | ... |
Or in SQL (I'm using Postgres):
CREATE TABLE series_rollups (
series VARCHAR(256),
timescale VARCHAR(20),
timestamp TIMESTAMP,
value INTEGER
);
INSERT INTO series_rollups (series, timescale, timestamp, value) VALUES
('active-users', 'weekly', '2016-03-05', 487),
('active-users', 'daily', '2016-03-05', 75),
('active-users', 'daily', '2016-03-06', 43),
('active-users', 'daily', '2016-03-07', 51),
('active-users', 'daily', '2016-03-08', 64),
('active-users', 'daily', '2016-03-09', 77),
('active-users', 'daily', '2016-03-10', 103),
('active-users', 'daily', '2016-03-11', 74);
The basic query form is:
SELECT timestamp, value
FROM series_rollups
WHERE 1=1
AND series = 'active-users'
AND timescale = 'daily'
AND timestamp >= '2016-03-05'
AND timestamp < '2016-03-12'
ORDER BY timestamp ASC
which can be modified to select data points from any series/timescale/date range. For all new-users
by week in 2016, we can change the query to:
SELECT timestamp, value
FROM series_rollups
WHERE 1=1
AND series = 'new-users'
AND timescale = 'weekly'
AND timestamp >= '2016-01-01'
AND timestamp < '2017-01-01'
ORDER BY timestamp ASC
- Timestamp representation - TODO
- Time period selection for display - TODO
- Tagging and subsets - TODO