Skip to content

Instantly share code, notes, and snippets.

@dlangille
Last active September 26, 2018 18:49
Show Gist options
  • Save dlangille/dc9895d13a38cf080afef7ac778145f2 to your computer and use it in GitHub Desktop.
Save dlangille/dc9895d13a38cf080afef7ac778145f2 to your computer and use it in GitHub Desktop.
Trying to invoke a SRF multiple times, gets real slow
CREATE OR REPLACE FUNCTION port_conflicts4(bigint, text) RETURNS SETOF ports_conflicts_matches AS $$
SELECT DISTINCT PC.id, CLP.port_id
FROM commit_log_ports CLP JOIN ports_conflicts PC
ON (PC.id = $1 AND CLP.port_name_revision ~ $2);
$$ LANGUAGE SQL STABLE;
-- it returns a tuple of bigint and integer and it's fast:
freshports.dev=# explain analyse SELECT * FROM port_conflicts4(22, '^mysql[0-9].*\-server\-.*$');
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Unique (cost=586.60..587.03 rows=85 width=12) (actual time=7.033..7.106 rows=26 loops=1)
-> Sort (cost=586.60..586.82 rows=85 width=12) (actual time=7.033..7.060 rows=408 loops=1)
Sort Key: clp.port_id
Sort Method: quicksort Memory: 44kB
-> Nested Loop (cost=6.14..583.88 rows=85 width=12) (actual time=1.799..6.902 rows=408 loops=1)
-> Index Only Scan using ports_conflicts_pkey on ports_conflicts pc (cost=0.15..8.17 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=1)
Index Cond: (id = '22'::bigint)
Heap Fetches: 1
-> Bitmap Heap Scan on commit_log_ports clp (cost=5.99..574.86 rows=85 width=4) (actual time=1.789..6.823 rows=408 loops=1)
Filter: (port_name_revision ~ '^mysql[0-9].*\-server\-.*$'::text)
Rows Removed by Filter: 1612
Heap Blocks: exact=1218
-> Bitmap Index Scan on commit_log_ports_port_name_revision (cost=0.00..5.96 rows=154 width=0) (actual time=0.588..0.588 rows=2020 loops=1)
Index Cond: ((port_name_revision >= 'mysql'::text) AND (port_name_revision < 'mysqm'::text))
Planning time: 0.521 ms
Execution time: 7.137 ms
(16 rows)
freshports.dev=#
freshports.dev=# SELECT * FROM port_conflicts4(22, '^mysql[0-9].*\-server\-.*$');
conflicts_id | port_id
--------------+---------
22 | 16965
22 | 26125
22 | 32408
22 | 37760
22 | 39843
22 | 40654
22 | 41123
22 | 41266
22 | 42249
22 | 42846
22 | 42892
22 | 44547
22 | 45081
22 | 45399
22 | 45987
22 | 46004
22 | 46302
22 | 46670
22 | 46837
22 | 48249
22 | 48566
22 | 48568
22 | 48611
22 | 49699
22 | 49702
22 | 49704
(26 rows)
freshports.dev=#
Given this data, I want to invoke the above function on each of those tuples:
freshports.dev=# SELECT id, '^' || item_value || '$' AS regex_pattern FROM ports_conflicts WHERE port_id = 43574;
id | regex_pattern
----+--------------------------------
21 | ^mysqlwsrep.*$
22 | ^mysql[0-9].*\-server\-.*$
23 | ^mariadb10[013-9]\-server\-.*$
24 | ^percona[0-9].*\-server\-.*$
25 | ^mariadb5.*\-server\-.*$
(5 rows)
freshports.dev=#
freshports.dev=# explain analyse SELECT PC4.*
FROM ports_conflicts PC, LATERAL (SELECT * FROM port_conflicts4(PC.id, '^' || PC.item_value || '$')) PC4;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=30152.53..30866438.72 rows=4364580 width=12) (actual time=910.949..4825.029 rows=60 loops=1)
-> Seq Scan on ports_conflicts pc (cost=0.00..20.20 rows=1020 width=40) (actual time=0.010..0.013 rows=5 loops=1)
-> Unique (cost=30152.53..30175.61 rows=4279 width=12) (actual time=964.966..964.998 rows=12 loops=5)
-> Sort (cost=30152.53..30164.07 rows=4616 width=12) (actual time=964.963..964.976 rows=170 loops=5)
Sort Key: clp.port_id
Sort Method: quicksort Memory: 31kB
-> Nested Loop (cost=0.15..29871.60 rows=4616 width=12) (actual time=515.133..964.844 rows=170 loops=5)
-> Index Only Scan using ports_conflicts_pkey on ports_conflicts pc_1 (cost=0.15..8.17 rows=1 width=8) (actual time=0.012..0.015 rows=1 loops=5)
Index Cond: (id = pc.id)
Heap Fetches: 5
-> Seq Scan on commit_log_ports clp (cost=0.00..29817.27 rows=4616 width=4) (actual time=515.113..964.766 rows=170 loops=5)
Filter: (port_name_revision ~ (('^'::text || pc.item_value) || '$'::text))
Rows Removed by Filter: 923091
Planning time: 0.286 ms
Execution time: 4825.074 ms
(15 rows)
freshports.dev=#
Yeah, that's not good.
By creating this function, a huge speed increase is obtained:
CREATE OR REPLACE FUNCTION port_conflicts5(bigint, text) RETURNS SETOF ports_conflicts_matches AS $$
DECLARE
a_ports_conflicts_id ALIAS FOR $1;
a_regex ALIAS FOR $2;
l_regex text;
r ports_conflicts_matches%rowtype;
BEGIN
l_regex := '^(' || a_regex || ')';
FOR r in SELECT distinct a_ports_conflicts_id, port_id FROM commit_log_ports WHERE port_name_revision ~ l_regex
LOOP
RETURN NEXT r;
END LOOP;
END;
$$ LANGUAGE 'plpgsql' STABLE;
freshports.dev=# explain analyse SELECT PC4.*
freshports.dev-# FROM ports_conflicts PC, LATERAL (SELECT * FROM port_conflicts5(PC.id, '^' || PC.item_value || '$')) PC4;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.26..20420.46 rows=1020000 width=12) (actual time=0.674..13.149 rows=60 loops=1)
-> Seq Scan on ports_conflicts pc (cost=0.00..20.20 rows=1020 width=40) (actual time=0.010..0.012 rows=5 loops=1)
-> Function Scan on port_conflicts5 (cost=0.26..10.26 rows=1000 width=12) (actual time=2.622..2.623 rows=12 loops=5)
Planning time: 0.101 ms
Execution time: 13.180 ms
(5 rows)
freshports.dev=# SELECT PC4.*
FROM ports_conflicts PC, LATERAL (SELECT * FROM port_conflicts5(PC.id, '^' || PC.item_value || '$')) PC4;
conflicts_id | port_id
--------------+---------
21 | 41490
21 | 44391
22 | 16965
22 | 26125
22 | 32408
22 | 37760
22 | 39843
22 | 40654
22 | 41123
22 | 41266
22 | 42249
22 | 42846
22 | 42892
22 | 44547
22 | 45081
22 | 45399
22 | 45987
22 | 46004
22 | 46302
22 | 46670
22 | 46837
22 | 48249
22 | 48566
22 | 48568
22 | 48611
22 | 49699
22 | 49702
22 | 49704
23 | 38042
23 | 44644
23 | 40421
23 | 48630
23 | 39708
23 | 40520
23 | 48881
23 | 46760
23 | 49666
23 | 35631
23 | 44827
23 | 42376
23 | 45526
23 | 49671
23 | 42255
23 | 46955
24 | 46502
24 | 44539
24 | 47245
24 | 44540
24 | 31147
24 | 42259
24 | 40750
24 | 32558
25 | 46699
25 | 32193
25 | 40409
25 | 28808
25 | 49668
25 | 45440
25 | 48244
25 | 42383
(60 rows)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment