Skip to content

Instantly share code, notes, and snippets.

@webdevilopers
Last active August 29, 2015 14:05
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save webdevilopers/11b695454d816f53b314 to your computer and use it in GitHub Desktop.
Save webdevilopers/11b695454d816f53b314 to your computer and use it in GitHub Desktop.
MySQL perfomance impact on COUNT DISTINCT when adding subquery to WHERE condition with 20000 rows in parent table
SELECT COUNT(DISTINCT a.angebot_id),
ash.angebot_status_datum,
(
SELECT MAX(ash2.angebot_status_datum)
FROM angebot_status_historie ash2
WHERE ash2.angebot_id = a.angebot_id
) AS current_state
FROM
angebot a
JOIN angebot_status_historie ash USING (angebot_id)
HAVING ash.angebot_status_datum = current_state
# duration 0,094 sec
SELECT COUNT(DISTINCT a.angebot_id)
FROM
angebot a
JOIN angebot_status_historie ash USING (angebot_id)
# duration 0,062 sec
SELECT COUNT(DISTINCT a.angebot_id)
FROM
angebot a
JOIN angebot_status_historie ash USING (angebot_id)
WHERE ash.angebot_status_datum = (
SELECT MAX(ash2.angebot_status_datum)
FROM angebot_status_historie ash2
WHERE ash2.angebot_id = a.angebot_id
)
# timeout
@webdevilopers
Copy link
Author

Alternativley I would use a HAVING clause and select the MAX() before. Unfortunately I am using a generator that has to reset the SELECT clause.

Is there a workaround?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment