Example scenario: Need to get all encounter rows and summary information about their labs such as min/max, and the value most immediately before and after an event of interest.
The following code will setup an example scenario. Some people have no labs, others only have labs before or only after the event of interest.
CREATE OR REPLACE TABLE curation.test_encounter
(
person_id INT64 NOT NULL,
case_id INT64 NOT NULL,
encounter_datetime DATETIME NOT NULL,
other_info STRING
);
INSERT INTO curation.test_encounter VALUES (1, 1, '2012-12-12 17:00:00', 'blah');
INSERT INTO curation.test_encounter VALUES (1, 2, '2013-12-12 17:00:00', 'foo');
INSERT INTO curation.test_encounter VALUES (2, 3, '2014-12-12 17:00:00', 'bar');
INSERT INTO curation.test_encounter VALUES (3, 4, '2015-12-12 17:00:00', 'baz');
INSERT INTO curation.test_encounter VALUES (4, 5, '2016-12-12 17:00:00', 'tmp');
CREATE OR REPLACE TABLE curation.test_lab
(
person_id INT64 NOT NULL,
case_id INT64 NOT NULL,
numeric_result FLOAT64,
lab_datetime DATETIME,
lab_details STRING
);
INSERT INTO curation.test_lab VALUES (1, 1, 1.15, '2012-12-09 17:00:00', 'blood');
INSERT INTO curation.test_lab VALUES (1, 1, 1.20, '2012-12-10 17:00:00', 'blood');
INSERT INTO curation.test_lab VALUES (1, 1, 1.25, '2012-12-11 17:00:00', 'blood');
INSERT INTO curation.test_lab VALUES (1, 2, 1.35, '2013-12-13 17:00:00', 'phlegm');
INSERT INTO curation.test_lab VALUES (2, 3, 1.40, '2014-12-11 17:00:00', 'ybile');
INSERT INTO curation.test_lab VALUES (2, 3, 1.45, '2014-12-11 20:00:00', 'ybile');
INSERT INTO curation.test_lab VALUES (2, 3, 1.55, '2014-12-13 17:00:00', 'bbile');
INSERT INTO curation.test_lab VALUES (3, 4, 1.65, '2015-12-13 17:00:00', 'blood');
Here's a query that gets part of the information:
SELECT
e.person_id,
e.case_id,
MAX(numeric_result) AS max_result,
MIN(numeric_result) AS min_result,
'?' AS result_before_encounter, -- how to get this by group?
'?' AS result_after_encounter, -- how to get this by group?
COUNT(DISTINCT lab_datetime) AS num_result
FROM curation.test_encounter e
LEFT JOIN curation.test_lab l
ON e.case_id = l.case_id
GROUP BY e.person_id, e.case_id
case_id | person_id | max_result | min_result | num_results | result_before | result_after |
---|---|---|---|---|---|---|
1 | 1 | 1.25 | 1.15 | 3 | ? | ? |
2 | 1 | 1.35 | 1.35 | 1 | ? | ? |
3 | 2 | 1.55 | 1.4 | 3 | ? | ? |
4 | 3 | 1.65 | 1.65 | 1 | ? | ? |
5 | 4 | NULL | NULL | 0 | ? | ? |
How do I go from the above to what I really want?
case_id | person_id | max_result | min_result | num_results | result_before | result_after |
---|---|---|---|---|---|---|
1 | 1 | 1.25 | 1.15 | 3 | 1.25 | NULL |
2 | 1 | 1.35 | 1.35 | 1 | NULL | 1.35 |
3 | 2 | 1.55 | 1.4 | 3 | 1.45 | 1.55 |
4 | 3 | 1.65 | 1.65 | 1 | NULL | 1.65 |
5 | 4 | NULL | NULL | 0 | NULL | NULL |