Now that we've merged dandelion#24244, we set out to test its performance in production. We isolated a set of channels with different orders of magnitude of feedback records and tested the search with this Ruby snippet:
def my_test(organization_id:, channel_id:, publication_date_begin: 10.years.ago, publication_date_end: 1.day.ago, limit: 100, offset: 0)
  RailsMultitenant[:organization_id] = organization_id
  starting = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  list = TargetSchemas::ChannelFeedback::Search.new(
    channel_id: channel_id,
    publication_date_begin:,
    publication_date_end:,
    limit:,
    offset:
  ).run_search!
  list.records.first # touch the records so the relation is actually loaded
  ending = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  # returns [record count, elapsed wall-clock time in seconds]
  [list.total, (ending - starting).round(2)]
end
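For reference, a hypothetical driver over the isolated channels could look like the loop below; the `(organization_id, channel_id)` pairs are the ones from the results table that follows, and the actual runs may have been issued one at a time in the console.

```ruby
# Hypothetical driver loop; the org/channel pairs come from the results table below.
channels = [[741, 43305], [17381, 327533], [18320, 381547], [17235, 300745], [16889, 372243]]
channels.each do |org_id, channel_id|
  total, seconds = my_test(organization_id: org_id, channel_id: channel_id)
  puts "org=#{org_id} channel=#{channel_id} total=#{total} time=#{seconds}s"
end
```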
Unfortunately, the results were not satisfactory: the worst case took almost 3 minutes to complete.
organization_id | channel_id | total_records | total_time (s) |
---|---|---|---|
741 | 43305 | 4000 | 1.27 |
17381 | 327533 | 11099 | 4.02 |
18320 | 381547 | 50507 | 8.32 |
17235 | 300745 | 107728 | 14.21 |
16889 | 372243 | 669528 | 174.36 |
Let's analyse the worst-case scenario, org 16889.

EXPLAIN ANALYZE of the record count query:
D, [2024-03-01T17:59:39.295840 #1] DEBUG -- : SQL (46558.8ms) EXPLAIN ANALYZE SELECT COUNT(*) FROM (SELECT 1 AS one FROM "target_schema_unified_feedback" INNER JOIN "export_runs" ON "export_runs"."organization_id" = 16889 AND "export_runs"."external_id" = "target_schema_unified_feedback"."publication_id" WHERE "target_schema_unified_feedback"."organization_id" = 16889 AND "export_runs"."created_at" BETWEEN '2014-03-01 17:14:41.628721' AND '2024-02-29 17:14:41.628844' AND "export_runs"."status" = 'completed' AND "target_schema_unified_feedback"."channel_id" = 372243 AND "target_schema_unified_feedback"."dismissed" = FALSE) subquery_for_count
Aggregate  (cost=12952.71..12952.72 rows=1 width=8) (actual time=46556.142..46556.144 rows=1 loops=1)
  ->  Hash Join  (cost=7835.28..12952.70 rows=4 width=0) (actual time=4270.297..46501.380 rows=669528 loops=1)
        Hash Cond: (((tspf.publication_id)::uuid) = export_runs.external_id)
        ->  Append  (cost=0.56..5014.86 rows=27488 width=16) (actual time=0.032..42058.130 rows=669545 loops=1)
              ->  Index Only Scan using index_active_product_feedback_on_org_channel_pub_product on target_schema_product_feedback tspf  (cost=0.56..4874.76 rows=27487 width=16) (actual time=0.032..41993.085 rows=669545 loops=1)
                    Index Cond: ((organization_id = 16889) AND (channel_id = 372243))
                    Heap Fetches: 135111
              ->  Index Scan using index_field_feedback_on_org_channel_product_field on target_schema_field_feedback tsff  (cost=0.43..2.66 rows=1 width=16) (actual time=0.928..0.928 rows=0 loops=1)
                    Index Cond: ((organization_id = 16889) AND (channel_id = 372243))
                    Filter: (NOT dismissed)
        ->  Hash  (cost=7746.66..7746.66 rows=7045 width=16) (actual time=4270.238..4270.239 rows=7862 loops=1)
              Buckets: 8192  Batches: 1  Memory Usage: 433kB
              ->  Index Scan using index_export_runs_by_org_id_status_created_at_desc_order on export_runs  (cost=0.56..7746.66 rows=7045 width=16) (actual time=2.304..4263.157 rows=7862 loops=1)
                    Index Cond: ((organization_id = 16889) AND ((status)::text = 'completed'::text) AND (created_at >= '2014-03-01 17:14:41.628721'::timestamp without time zone) AND (created_at <= '2024-02-29 17:14:41.628844'::timestamp without time zone))
Planning Time: 0.975 ms
Execution Time: 46556.224 ms
EXPLAIN ANALYZE for the full query (which includes some left joins to fetch associated records):
=> "SELECT \"target_schema_unified_feedback\".\"product_feedback_id\" AS t0_r0, \"target_schema_unified_feedback\".\"field_feedback_id\" AS t0_r1, \"target_schema_unified_feedb...
dandelion(prod)(read-only):004:0> puts ActiveRecord::Base.connection.exec_query("EXPLAIN ANALYZE #{sql}").rows.join("\n");1
Limit  (cost=38524.24..38524.24 rows=1 width=4245) (actual time=210540.056..210540.061 rows=1 loops=1)
  ->  Sort  (cost=38524.24..38524.25 rows=4 width=4245) (actual time=210540.056..210540.060 rows=1 loops=1)
        Sort Key: export_runs.created_at DESC
        Sort Method: top-N heapsort  Memory: 29kB
        ->  Nested Loop Left Join  (cost=7836.85..38524.22 rows=4 width=4245) (actual time=4480.898..209365.943 rows=669528 loops=1)
              ->  Nested Loop Left Join  (cost=7836.41..38514.50 rows=4 width=3990) (actual time=4480.885..208798.307 rows=669528 loops=1)
                    ->  Nested Loop Left Join  (cost=7835.85..38503.80 rows=4 width=3503) (actual time=4479.974..196133.862 rows=669528 loops=1)
                          ->  Hash Join  (cost=7835.28..38492.68 rows=4 width=2180) (actual time=4479.495..191670.024 rows=669528 loops=1)
                                Hash Cond: (((tspf.publication_id)::uuid) = export_runs.external_id)
                                ->  Append  (cost=0.56..30554.84 rows=27488 width=690) (actual time=1.335..186553.167 rows=669545 loops=1)
                                      ->  Index Scan using index_active_product_feedback_on_org_channel_pub_product on target_schema_product_feedback tspf  (cost=0.56..30414.74 rows=27487 width=690) (actual time=1.334..186431.707 rows=669545 loops=1)
                                            Index Cond: ((organization_id = 16889) AND (channel_id = 372243))
                                      ->  Index Scan using index_field_feedback_on_org_channel_product_field on target_schema_field_feedback tsff  (cost=0.43..2.66 rows=1 width=393) (actual time=0.833..0.833 rows=0 loops=1)
                                            Index Cond: ((organization_id = 16889) AND (channel_id = 372243))
                                            Filter: (NOT dismissed)
                                ->  Hash  (cost=7746.66..7746.66 rows=7045 width=1490) (actual time=4478.109..4478.110 rows=7862 loops=1)
                                      Buckets: 8192  Batches: 1  Memory Usage: 12841kB
                                      ->  Index Scan using index_export_runs_by_org_id_status_created_at_desc_order on export_runs  (cost=0.56..7746.66 rows=7045 width=1490) (actual time=2.031..4455.054 rows=7862 loops=1)
                                            Index Cond: ((organization_id = 16889) AND ((status)::text = 'completed'::text) AND (created_at >= '2014-03-01 17:14:41.628721'::timestamp without time zone) AND (created_at <= '2024-02-29 17:14:41.628844'::timestamp without time zone))
                          ->  Index Scan using products_pkey on products  (cost=0.57..2.78 rows=1 width=1323) (actual time=0.006..0.006 rows=1 loops=669528)
                                Index Cond: (id = tspf.product_id)
                                Filter: ((destroyed_at IS NULL) AND (NOT pending) AND (space_id IS NULL) AND (organization_id = 16889))
                    ->  Index Scan using target_schema_product_feedback_pkey on target_schema_product_feedback  (cost=0.56..2.67 rows=1 width=487) (actual time=0.018..0.018 rows=1 loops=669528)
                          Index Cond: (id = tspf.id)
                          Filter: (organization_id = 16889)
              ->  Index Scan using target_schema_field_feedback_pkey on target_schema_field_feedback  (cost=0.43..2.43 rows=1 width=255) (actual time=0.000..0.000 rows=0 loops=669528)
                    Index Cond: (id = (NULL::bigint))
                    Filter: (organization_id = 16889)
Planning Time: 4.896 ms
Execution Time: 210540.707 ms
The main problem seems to be the index scan on target_schema_product_feedback, which returns around 670k rows (there are 0 field feedback records); we then have to join all of those rows against export_runs only to reduce them, by date, to the rows we actually want.
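If we wanted to confirm where that time goes, one option (not run above) would be to repeat the statement with buffer statistics from the same read-only console; a sketch, reusing the `sql` string captured earlier:

```ruby
# Not part of the session above: EXPLAIN (ANALYZE, BUFFERS) also reports how much
# buffer I/O the ~186s index scan on target_schema_product_feedback performs.
plan = ActiveRecord::Base.connection.exec_query("EXPLAIN (ANALYZE, BUFFERS) #{sql}")
puts plan.rows.join("\n")
```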
Adding a publication_date column to the feedback tables, together with an index on it, would let us retrieve only the rows we need and eliminate the join against export_runs, making the query faster.
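As a rough sketch of that idea (the migration class, column name, index names, and Rails version below are assumptions rather than the actual change, and a backfill of publication_date from the matching export_runs.created_at would still be needed):

```ruby
# Sketch only: denormalize the publication date onto the feedback tables so the
# search can filter and sort by date without joining export_runs.
class AddPublicationDateToFeedbackTables < ActiveRecord::Migration[7.0]
  disable_ddl_transaction! # required for CREATE INDEX CONCURRENTLY

  def change
    add_column :target_schema_product_feedback, :publication_date, :datetime
    add_column :target_schema_field_feedback, :publication_date, :datetime

    add_index :target_schema_product_feedback,
              [:organization_id, :channel_id, :publication_date],
              algorithm: :concurrently,
              name: "index_product_feedback_on_org_channel_publication_date"
    add_index :target_schema_field_feedback,
              [:organization_id, :channel_id, :publication_date],
              algorithm: :concurrently,
              name: "index_field_feedback_on_org_channel_publication_date"
  end
end
```

With those indexes in place, the search could apply the date range directly on the feedback tables and would no longer need export_runs just to narrow down the result set.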