Skip to content

Instantly share code, notes, and snippets.

@aheuermann
Last active December 14, 2020 17:16
Show Gist options
  • Save aheuermann/9db65fd594f74d57562cb79147b8f1ce to your computer and use it in GitHub Desktop.
Save aheuermann/9db65fd594f74d57562cb79147b8f1ce to your computer and use it in GitHub Desktop.
Dual-FK Join Example

We do change-data-capture using debezium on our main database. These changes are written to kafka. We have a use-case for something along these lines:

Inputs:

users_teams

What users belong to what teams (users can belong to multiple teams).

  • id String (key)
  • user_id String
  • group_id String

teams_leagues

What leagues do teams belong to (teams can belong to multiple leagues).

  • id String (key)
  • team_id String
  • league_id String

Desired Output

users_leagues

We want to output a cdc-like stream of users and leagues based on the above change-data-capture streams. So as long as there is at least one link between a user -> team -> league the user is considered to belong to the league. If a user is removed from a team and they aren't linked to the league via a different team a tombstone is emitted.

  • user_id String (key)
  • team_id String (key)

If solving this via SQL against a traditional RDMS we would write the following query.

SELECT 
   users_teams.user_id,
   teams_leagues.league_id
FROM users_teams
JOIN teams_leagues on (users_teams.team_id = teams_leagues.team_id)
GROUP BY users_teams.user_id, teams_leagues.league_id

Is there anyway to accomplish this in KSQL?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment