Skip to content

Instantly share code, notes, and snippets.

@freshtonic
Created March 13, 2010 08:41
Show Gist options
  • Save freshtonic/331205 to your computer and use it in GitHub Desktop.
Save freshtonic/331205 to your computer and use it in GitHub Desktop.
-- A Postgres 8.4 query against a table called 'trade_summaries'
-- This query makes use of WINDOW FUNCTIONS to provide access to the previous trading
-- day on the row for the current trading day.
-- The end result of the query is a set of market trading days with the interesting column being
-- the number of standard deviations that the difference in total trading value for the current
-- day versus the previous day is from the average difference between trading days.
-- What makes this query more verbose than it has to be is eacerbated by three limitations of Postgres:
-- (NOTE: there may be a good technical or theoretical 'correctness' reason for these limitations,
-- but I am obviously not smart enough to figure it out!)
-- 1. You can't refer to calculated columns within the same select statement.
-- To get around this you are forced to expand the expression rather than use its alias.
-- 2. When forced to expand an expression that is a window function, you'll find that you can't
-- embed it in an aggregate function (Postgres doesn't allow it).
-- 3. You can't nest window function expressions.
--
-- So to get around the above limitations, I have a quadruple nested query, which is more verbose than
-- I'd like.
--
-- Any Postgres experts like to suggest some improvements?
SELECT
standard_deviations.day_id AS day_id,
standard_deviations.cal_date AS cal_date,
standard_deviations.current_day_metric AS current_day_metric,
standard_deviations.previous_day_metric AS previous_day_metric,
standard_deviations.standard_deviation AS standard_deviation,
standard_deviations.average_difference AS average_difference,
standard_deviations.difference AS difference,
(abs(standard_deviations.difference - standard_deviations.average_difference) / standard_deviations.standard_deviation ) AS count_standard_devations
FROM
(
SELECT
differences.day_id AS day_id,
differences.cal_date AS cal_date,
differences.current_day_metric AS current_day_metric,
differences.previous_day_metric AS previous_day_metric,
stddev(differences.difference) OVER year_window AS standard_deviation,
avg(differences.difference) OVER year_window AS average_difference,
differences.difference AS difference
FROM
(
SELECT
daily_summaries.day_id as day_id,
daily_summaries.cal_date as cal_date,
daily_summaries.metric as current_day_metric,
lag(daily_summaries.metric,1) OVER year_window AS previous_day_metric,
abs(daily_summaries.metric - lag(daily_summaries.metric,1) OVER year_window) AS difference
FROM
(
SELECT
min(days.id) AS day_id,
min(days.cal_date) AS cal_date,
sum(value_open + value_between + value_close) AS metric
FROM
trade_summaries AS ts
INNER JOIN
days ON (ts.day_id = days.id)
WHERE
days.is_trading_day = true
AND days.year = 2007
GROUP BY
days.cal_date
ORDER BY days.cal_date ASC
) AS daily_summaries
WINDOW
year_window as (PARTITION BY extract(year FROM cal_date) ORDER BY extract(year FROM cal_date) ASC)
) AS differences
WINDOW
year_window as (PARTITION BY extract(year FROM cal_date) ORDER BY extract(year FROM cal_date) ASC)
) AS standard_deviations
WINDOW
year_window as (PARTITION BY extract(year FROM cal_date) ORDER BY extract(year FROM cal_date) ASC)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment