Skip to content

Instantly share code, notes, and snippets.

@greenlion
Last active October 21, 2021 17:30
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save greenlion/58b9e9f098672b40ffcb906be2530334 to your computer and use it in GitHub Desktop.
Save greenlion/58b9e9f098672b40ffcb906be2530334 to your computer and use it in GitHub Desktop.
star schema optimization!
mysql>
-- This will force the storage engine to examine only a single partition.
set warp_partition_filter='lineorder: p0';
mysql> explain
select straight_join count(*)
from lineorder
join dim_date on lo_orderdatekey = d_datekey
join customer on lo_custkey = c_customerkey
join supplier on lo_suppkey = s_suppkey
join part on lo_partkey = p_partkey
where c_region = 'AMERICA'
and s_region = 'AMERICA'
and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2') \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: lineorder
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 10000000
filtered: 100.00
Extra: NULL
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: dim_date
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2556
filtered: 10.00
Extra: Using where; Using join buffer (hash join)
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: customer
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 6000000
filtered: 1.00
Extra: Using where; Using join buffer (hash join)
*************************** 4. row ***************************
id: 1
select_type: SIMPLE
table: supplier
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 400000
filtered: 1.00
Extra: Using where; Using join buffer (hash join)
*************************** 5. row ***************************
id: 1
select_type: SIMPLE
table: part
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1600000
filtered: 1.90
Extra: Using where; Using join buffer (hash join)
5 rows in set, 1 warning (0.37 sec)
| -> Aggregate: count(0) (actual time=2673.453..2673.453 rows=1 loops=1)
-> Inner hash join (part.P_PartKey = lineorder.LO_PartKey) (cost=27809448678227874000.00 rows=**4744135967067528192**) (actual time=2528.249..2673.360 rows=**1567** loops=1)
-> Filter: ((part.P_MFGR = 'MFGR#1') or (part.P_MFGR = 'MFGR#2')) (cost=0.03 rows=30400) (actual time=0.021..135.091 rows=135913 loops=1)
-> Table scan on part (cost=0.03 rows=1600000) (actual time=0.019..108.078 rows=135913 loops=1)
-> Hash
-> Inner hash join (supplier.S_SuppKey = lineorder.LO_SuppKey) (cost=84307466909803060.00 rows=8213532327154137) (actual time=2447.927..2527.103 rows=5758 loops=1)
-> Filter: (supplier.S_Region = 'AMERICA') (cost=0.01 rows=4000) (actual time=0.020..72.939 rows=79467 loops=1)
-> Table scan on supplier (cost=0.01 rows=**400000**) (actual time=0.019..60.330 rows=**79467** loops=1)
-> Hash
-> Inner hash join (customer.C_CustomerKey = lineorder.LO_CustKey) (cost=2053736025816746.00 rows=205338279110954) (actual time=2392.579..2446.978 rows=5790 loops=1)
-> Filter: (customer.C_Region = 'AMERICA') (cost=0.03 rows=60000) (actual time=0.029..50.057 rows=48322 loops=1)
-> Table scan on customer (cost=0.03 rows=**6000000**) (actual time=0.027..42.789 rows=**48322** loops=1)
-> Hash
-> Inner hash join (dim_date.D_DateKey = lineorder.LO_OrderDateKey) (cost=342364304601.11 rows=342230416738) (actual time=2361.986..2373.074 rows=159918 loops=1)
-> Table scan on dim_date (cost=0.00 rows=2556) (actual time=0.437..1.414 rows=2556 loops=1)
-> Hash
-> Table scan on lineorder (cost=9999999.30 rows=**10000000**) (actual time=**1979.563..2346.278** rows=**159918** loops=1)
1 row in set (2.64 sec)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment