Last active
September 15, 2015 21:28
-
-
Save slpsys/5e43d8237fd8aa924015 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
warehouse=# CREATE VIEW events.events AS SELECT * FROM events.a UNION SELECT * FROM events.b; | |
CREATE | |
warehouse=# CREATE VIEW events.events AS SELECT * FROM events.a UNION ALL SELECT * FROM events.b; | |
CREATE |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
warehouse=# explain select count(*) from events.events where user_id = 1234567; | |
QUERY PLAN | |
--------------------------------------------------------------------------------------------------------- | |
XN Aggregate (cost=97603805.53..97603805.53 rows=1 width=0) | |
-> XN Subquery Scan events (cost=0.00..97593174.46 rows=4252427 width=0) | |
-> XN Unique (cost=0.00..97550650.19 rows=4252427 width=3373) | |
-> XN Append (cost=0.00..96934048.27 rows=4252427 width=3373) | |
-> XN Subquery Scan "*SELECT* 1" (cost=0.00..86274211.60 rows=5480 width=1320) | |
-> XN Seq Scan on "a" (cost=0.00..86274156.80 rows=5480 width=1320) | |
Filter: (user_id = 1234567) | |
-> XN Subquery Scan "*SELECT* 2" (cost=0.00..10659836.67 rows=4246947 width=3373) | |
-> XN Seq Scan on b (cost=0.00..10617367.20 rows=4246947 width=3373) | |
Filter: (user_id = 1234567) | |
(10 rows) | |
warehouse=# explain select count(*) from events.events2 where user_id = 1234567; | |
QUERY PLAN | |
------------------------------------------------------------------------------------------------ | |
XN Aggregate (cost=96987203.61..96987203.61 rows=1 width=0) | |
-> XN Subquery Scan events2 (cost=0.00..96976572.54 rows=4252427 width=0) | |
-> XN Append (cost=0.00..96934048.27 rows=4252427 width=0) | |
-> XN Subquery Scan "*SELECT* 1" (cost=0.00..86274211.60 rows=5480 width=0) | |
-> XN Seq Scan on "a" (cost=0.00..86274156.80 rows=5480 width=0) | |
Filter: (user_id = 1234567) | |
-> XN Subquery Scan "*SELECT* 2" (cost=0.00..10659836.67 rows=4246947 width=0) | |
-> XN Seq Scan on b (cost=0.00..10617367.20 rows=4246947 width=0) | |
Filter: (user_id = 1234567) | |
(9 rows) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
warehouse=# select count(*) from events.a where user_id = 1234567; | |
count | |
------- | |
3885 | |
(1 row) | |
Time: 476.454 ms | |
warehouse=# select count(*) from events.events2 where user_id = 1234567; | |
count | |
------- | |
3885 | |
(1 row) | |
Time: 504.726 ms | |
warehouse=# select count(*) from events.events where user_id = 1234567; | |
count | |
------- | |
3885 | |
(1 row) | |
Time: 70111.247 ms |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
warehouse=# explain select name from events.events2 where user_id = 1234567; | |
QUERY PLAN | |
------------------------------------------------------------------------------------------- | |
XN Subquery Scan events2 (cost=0.00..96976572.54 rows=4252427 width=32) | |
-> XN Append (cost=0.00..96934048.27 rows=4252427 width=19) | |
-> XN Subquery Scan "*SELECT* 1" (cost=0.00..86274211.60 rows=5480 width=16) | |
-> XN Seq Scan on "a" (cost=0.00..86274156.80 rows=5480 width=16) | |
Filter: (user_id = 1234567) | |
-> XN Subquery Scan "*SELECT* 2" (cost=0.00..10659836.67 rows=4246947 width=19) | |
-> XN Seq Scan on b (cost=0.00..10617367.20 rows=4246947 width=19) | |
Filter: (user_id = 1234567) | |
(8 rows) | |
Time: 20.512 ms | |
warehouse=# explain select name, page from events.events2 where user_id = 1234567; | |
QUERY PLAN | |
------------------------------------------------------------------------------------------- | |
XN Subquery Scan events2 (cost=0.00..96976572.54 rows=4252427 width=64) | |
-> XN Append (cost=0.00..96934048.27 rows=4252427 width=37) | |
-> XN Subquery Scan "*SELECT* 1" (cost=0.00..86274211.60 rows=5480 width=31) | |
-> XN Seq Scan on "a" (cost=0.00..86274156.80 rows=5480 width=31) | |
Filter: (user_id = 1234567) | |
-> XN Subquery Scan "*SELECT* 2" (cost=0.00..10659836.67 rows=4246947 width=37) | |
-> XN Seq Scan on b (cost=0.00..10617367.20 rows=4246947 width=37) | |
Filter: (user_id = 1234567) |
3_runtimes.sql
shows that the difference between SELECT COUNT(*)
over events2 (with the UNION ALL
) is only ~5% more expensive than querying the constituent table a
by itself--e.g. it likely is not doing a 60-way join (there are 60 columns in the raw tables) before simply filtering on one column and counting the results.
select count(*) from events.events
, however, takes 147x as long to run. Again, there are ~9 billion rows, so that the DISTINCT
becomes very expensive for a large N is unsurprising.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The two tables,
a
andb
contain about 8B and 1B rows, respectively;b
contains no rows whereuser_id
is not null.