@laat
Last active May 14, 2021 19:02
Episodes in wrong TV Series.
WITH
external_data
/*
In an external system, we have some episode to series relations.
An episode can be in multiple series.
The SERIES_ID is a monotonically increasing database ID.
*/
AS (SELECT 10 SERIES_ID, 1 EPISODE_ID FROM dual UNION ALL
SELECT 2 SERIES_ID, 1 EPISODE_ID FROM dual),
transformed_data
/*
Somewhere along the way, in a materialized view at our end,
the SERIES_ID is cast to NVARCHAR2 (unicode string) because
we store it like that. I do not know why.
*/
AS (SELECT cast (SERIES_ID as NVARCHAR2 (355)) SERIES_ID, EPISODE_ID
FROM external_data)
/*
In our domain, an episode can only be in a single series,
and when it's in multiple series in the source system it has
to be disambiguated.
It looks like the intent of the original dev(s) was to put the
episode into the first series created, by taking "min (SERIES_ID)",
since the series id is monotonically increasing.
*/
SELECT EPISODE_ID, cast (min (SERIES_ID) as NVARCHAR2 (355)) SERIES_ID
FROM transformed_data
group by EPISODE_ID;
/*
Episode 1 ends up in series 10. This is the "bug".
This is the worst kind of disambiguation. It takes the minimum
series id while the id is a string, and when strings are sorted,
"10" is smaller than "2".
Try to explain to a non-technical person why the episode
does not appear in the series they expected, I dare you.
When we tried to correct this weirdness by putting each episode in
the first series created, thousands of episodes were moved to some
other series. Thus, the logic remains, and will likely
stay that way for a long time.
To future devs, I'm sorry.
*/
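/*
For comparison only, a minimal sketch of what the numeric
disambiguation could look like. This is NOT the logic in use.
It assumes every SERIES_ID string holds a plain integer, so
to_number() is safe to apply. The CTEs are repeated so that the
statement runs on its own.
With min() comparing numbers, episode 1 lands in series 2.
*/
WITH
external_data
AS (SELECT 10 SERIES_ID, 1 EPISODE_ID FROM dual UNION ALL
SELECT 2 SERIES_ID, 1 EPISODE_ID FROM dual),
transformed_data
AS (SELECT cast (SERIES_ID as NVARCHAR2 (355)) SERIES_ID, EPISODE_ID
FROM external_data)
SELECT EPISODE_ID,
cast (min (to_number (SERIES_ID)) as NVARCHAR2 (355)) SERIES_ID
FROM transformed_data
group by EPISODE_ID;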