Skip to content

Instantly share code, notes, and snippets.

@bensheldon
Last active July 23, 2024 17:02
Show Gist options
  • Save bensheldon/4054feed291e8ae7f54d40b14ad7ba79 to your computer and use it in GitHub Desktop.
Save bensheldon/4054feed291e8ae7f54d40b14ad7ba79 to your computer and use it in GitHub Desktop.

In my application every time I send a newsletter to a User, I create a NewsletterDelivery record. I frequently want to be able to query, for each user, what is the most recent newsletter delivery record. This is called a "last n per group" query and a LATERAL JOIN is the best way to do it imo. But the query I've been using (and I've told people to use, and seen blogged about) is not very good, because the conditions don't get pushed down into the subselect which means that the query ends-up lateral-joining all the records before it applies the conditions for the association.

Instead of doing subselect_table.* the better query does association.id AS assocation_id, subselect_table.id, subselect_table.title, .... and enumerates over all of the columns. This allows the association query, which Active Record tacks on at the end as WHERE association_id = $1 or WHERE association_id IN ($1, $2, $3, ...) to be pushed down correctly.

Old query

ROWS=8865 🥵

SELECT "newsletter_deliveries".* 
FROM (
    SELECT latest.* 
    FROM "users" 
    INNER JOIN LATERAL (
        SELECT "newsletter_deliveries".* 
	FROM "newsletter_deliveries" 
	WHERE (user_id = users.id) 
	ORDER BY "newsletter_deliveries"."created_at" DESC 
	LIMIT 1
    ) AS latest ON TRUE
) newsletter_deliveries 
WHERE "newsletter_deliveries"."user_id" IN (1, 2)

Nested Loop  (cost=0.72..13168.14 rows=8865 width=80)
  ->  Index Only Scan using users_pkey on users  (cost=0.29..336.26 rows=8865 width=4)
  ->  Subquery Scan on latest  (cost=0.43..1.44 rows=1 width=80)
        Filter: (latest.user_id = ANY ('{1,2}'::bigint[]))
        ->  Limit  (cost=0.43..1.42 rows=1 width=80)
              ->  Index Scan using index_newsletter_deliveries_on_user_id_and_created_at on newsletter_deliveries  (cost=0.43..1444.49 rows=1455 width=80)
                    Index Cond: (user_id = users.id)

New query

ROWS=2 🎉

SELECT * FROM (
    SELECT "users"."id" AS user_id, "latest"."id", "latest"."created_at", "latest"."updated_at", "latest"."delivered_at", "latest"."opened_at", "latest"."clicked_at", "latest"."spammed_at", "latest"."uuid"
    FROM "users" 
    INNER JOIN LATERAL (
        SELECT "newsletter_deliveries".* 
        FROM "newsletter_deliveries" 
        WHERE (user_id = users.id) 
        ORDER BY "newsletter_deliveries"."created_at" DESC 
        LIMIT 1
    ) AS latest ON TRUE
) newsletter_deliveries 
WHERE "newsletter_deliveries"."user_id" IN (1, 2)

Nested Loop  (cost=0.72..7.17 rows=2 width=76)
  ->  Index Only Scan using users_pkey on users  (cost=0.29..4.28 rows=2 width=4)
        Index Cond: (id = ANY ('{1,2}'::integer[]))
  ->  Limit  (cost=0.43..1.42 rows=1 width=80)
        ->  Index Scan using index_newsletter_deliveries_on_user_id_and_created_at on newsletter_deliveries  (cost=0.43..1444.49 rows=1455 width=80)
              Index Cond: (user_id = users.id)

In Arel

class NewsletterDelivery < ApplicationRecord
  belongs_to :user

  scope :recent, lambda { |count = 1|
    latest_table = Arel::Table.new('latest')
    recent_newsletters = User
                         .select(User.arel_table["id"].as("user_id"))
                         .select(NewsletterDelivery.column_names.without("user_id").map { |column| latest_table[column] })
                         .arel.join(
                           NewsletterDelivery
                             .where(arel_table["user_id"].eq(User.arel_table["id"]))
                             .order(created_at: :desc)
                             .limit(count)
                             .arel.lateral(latest_table.name)
                         ).on("TRUE")
    from(recent_newsletters.as(arel_table.name))
  }
end

class User < ApplicationRecord
  has_many :newsletter_deliveries
  has_one :recent_newsletter_delivery, -> { recent(1) }, class_name: "NewsletterDelivery"
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment