Given a table of exams, in which each exam belongs to a "student", we want to derive a new table in which for any given student exam, we see how much the score has changed compared to the student's previous exam.
Here's the table of exams, which is generated by the SQL in sql-make-exams-table.sql](#file-sql-make-exams-table-sql):
exam_id | year | student | score |
---|---|---|---|
14 | 2014 | Alice | 87 |
1 | 2010 | Alice | 76 |
2 | 2011 | Alice | 82 |
3 | 2010 | Bob | 55 |
4 | 2013 | Alice | 92 |
5 | 2009 | Bob | 70 |
6 | 2012 | Dan | 84 |
7 | 2011 | Eric | 34 |
8 | 2016 | Alice | 80 |
9 | 2015 | Dan | 100 |
10 | 2013 | Bob | 42 |
The end goal is to produce a table in which we can do interesting analyses, such as find the student with the biggest improvement, or worst change, etc.:
student | time_period | score_diff | prior_year | prior_score | year | score | prior_id | exam_id |
---|---|---|---|---|---|---|---|---|
Alice | 2014-2016 | -7 | 2014 | 87 | 2016 | 80 | 14 | 8 |
Alice | 2013-2014 | -5 | 2013 | 92 | 2014 | 87 | 4 | 14 |
Alice | 2011-2013 | 10 | 2011 | 82 | 2013 | 92 | 2 | 4 |
Alice | 2010-2011 | 6 | 2010 | 76 | 2011 | 82 | 1 | 2 |
Alice | 2010 | 76 | 1 | |||||
Bob | 2010-2013 | -13 | 2010 | 55 | 2013 | 42 | 3 | 10 |
Bob | 2009-2010 | -15 | 2009 | 70 | 2010 | 55 | 5 | 3 |
Bob | 2009 | 70 | 5 | |||||
Dan | 2012-2015 | 16 | 2012 | 84 | 2015 | 100 | 6 | 9 |
Dan | 2012 | 84 | 6 | |||||
Eric | 2011 | 34 | 7 |
Alice is an easy example -- in 2010, she scored 87 points on exam #1. In 2011, she scored 82 points on exam #2.
If we wanted to record how exam#2 is related to exam#1, we could create a lookup table consisting of 2 columns -- the "current" exam id, and the id of the exam that preceded it, scoped by the student:
exam_id | prior_id |
---|---|
2 | 1 |
This seems like it could be done recursively, but we can also use a correlated query:
SELECT
f.exam_id AS exam_id
, (SELECT exam_id
FROM exams AS g
WHERE
f.student = g.student
AND f.year > g.year
ORDER BY
g.year DESC
LIMIT 1
) AS prior_id
FROM exams AS f;
Which gets us this result:
exam_id | prior_id |
---|---|
14 | 4 |
1 | |
2 | 1 |
3 | 5 |
4 | 2 |
5 | |
6 | |
7 | |
8 | 14 |
9 | 6 |
10 | 3 |
We can make a view/temp table of the previous "lookups" query, but I like just throwing it into a common table expression:
WITH lookups AS
( SELECT
f.exam_id AS exam_id
, (SELECT exam_id
FROM exams AS g
WHERE
f.student = g.student
AND f.year > g.year
ORDER BY
g.year DESC
LIMIT 1
) AS prior_id
FROM exams AS f)
SELECT
ex.student
, px.year || '-' || ex.year
AS time_period
, ex.score - px.score
AS score_diff
, px.year AS prior_year
, px.score AS prior_score
, ex.year
, ex.score
, lookups.prior_id
, ex.exam_id
FROM exams AS ex
INNER JOIN lookups
ON ex.exam_id = lookups.exam_id
LEFT JOIN exams AS px
ON lookups.prior_id = px.exam_id
ORDER BY
ex.student ASC
, ex.year DESC
;