Navigation Menu

Skip to content

Instantly share code, notes, and snippets.

@bennadel
Created September 16, 2020 11:07
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save bennadel/c94f261d234ae5e0663de0087f0d47fe to your computer and use it in GitHub Desktop.
Save bennadel/c94f261d234ae5e0663de0087f0d47fe to your computer and use it in GitHub Desktop.
Using Derived Tables To Generate "Stats" For An Outer Query JOIN In MySQL 5.6.49
SELECT
( u.id ) AS user_id,
( u.name ) AS user_name,
-- Get activity logs stats for this user.
( activityLogStats.logCount ) AS log_count,
( activityLogStats.maxCreatedAt ) AS log_createdAt
FROM
user u
INNER JOIN
-- Create a derived table that contains statistics about log-activity for the set of
-- users in a way that we can JOIN to this table using the USER ID. This allows us to
-- calculate this data ONCE FOR THE QUERY rather than ONCE PER USER.
-- --
-- NOTE: In a production setting, I would almost certainly be limiting the scope of
-- derived table in some way, using an indexed value. Otherwise, this would be a
-- full-table scan, which would likely be catastrophic for performance.
(
SELECT
l.userID,
COUNT( * ) AS logCount,
MAX( l.createdAt ) AS maxCreatedAt
FROM
activity_log l
GROUP BY
l.userID
) AS activityLogStats
ON
activityLogStats.userID = u.id
ORDER BY
u.id ASC
;
SELECT
( u.id ) AS user_id,
( u.name ) AS user_name,
-- Get activity logs stats for this user.
( activityLogStats.logCount ) AS log_count,
( activityLogStats.maxCreatedAt ) AS log_createdAt
FROM
organization_membership om
INNER JOIN
user u
ON
(
om.organizationID = 4 -- Limiting the outer-query based on organization.
AND
u.id = om.userID
)
INNER JOIN
(
SELECT
_l.userID,
COUNT( * ) AS logCount,
MAX( _l.createdAt ) AS maxCreatedAt
FROM
organization_membership _om
INNER JOIN
activity_log _l
ON
-- Notice that our derived table is both repeating the organization-based
-- filtering as well as adding additional filtering based on date (assumes
-- that the createdAt date is part of the indexing). This minimizes the
-- number of rows that have to be read in order to derive this table.
(
_om.organizationID = 4
AND
_l.userID = _om.userID
AND
_l.createdAt >= DATE_ADD( UTC_TIMESTAMP(), INTERVAL -30 DAY )
)
GROUP BY
_l.userID
) AS activityLogStats
ON
activityLogStats.userID = u.id
ORDER BY
u.id ASC
;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment