Skip to content

Instantly share code, notes, and snippets.

@slpsys
Last active May 4, 2018 20:03
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save slpsys/cb575f3b5802f3a81de394732cba6727 to your computer and use it in GitHub Desktop.
Save slpsys/cb575f3b5802f3a81de394732cba6727 to your computer and use it in GitHub Desktop.
Query large (20GB on-cluster, for us) Redshift tables that are missing sort keys or are not diststyle key
SELECT
TRIM(pgn.nspname) AS schema,
TRIM(a.name) AS table,
CASE WHEN pgc.reldiststyle != 1 AND ti.sortkey_num = 0 THEN 'DISTSTYLE NOT KEY AND NO SORTKEY (TABLE IS GARBAGE)'
WHEN pgc.reldiststyle != 1 THEN 'DISTSTYLE NOT KEY'
WHEN ti.sortkey_num = 0 THEN 'NO SORTKEY'
END as issue
FROM
(
SELECT db_id, id, name, SUM(rows) AS rows
FROM stv_tbl_perm a
GROUP BY db_id, id, name
) AS a
JOIN pg_class AS pgc ON pgc.oid = a.id
JOIN pg_namespace AS pgn ON pgn.oid = pgc.relnamespace
AND pgn.nspowner > 1
JOIN
(
SELECT tbl, COUNT(*) AS mbytes
FROM stv_blocklist
GROUP BY tbl
) b ON a.id = b.tbl
JOIN svv_table_info ti
ON ti.schema = pgn.nspname AND a.name = ti.table
WHERE b.mbytes >= 20480
AND (pgc.reldiststyle != 1 OR ti.sortkey_num = 0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment