Skip to content

Instantly share code, notes, and snippets.

@godfat
Last active December 28, 2015 05:09
Show Gist options
  • Save godfat/7447806 to your computer and use it in GitHub Desktop.
Save godfat/7447806 to your computer and use it in GitHub Desktop.

I am aware that they are slightly different, but I am too lazy to make the one in ActiveRecord do the same thing as Sequel. In general it seems Sequel would take the advantage of subqueries, but ActiveRecord might not.

In this case, it seems Sequel's SQL is faster than the one generated by ActiveRecord.

Sequel model:

class User < Sequel::Model(Sequel.connect(DATABASE_CONFIG))
  one_to_many :issues, :class => class_name(:Question),
    :dataset => lambda{
      # note that eager_graph would mean a join, but
      # eager would only load it via another SQL with in ([id])
      Question.eager(:user).eager_graph(:categories).
        where(:categories => categories_dataset).
        where(:state => 'created').
        where{questions__created_at > Time.now - 86400*2}.
        reverse_order(:questions__created_at).distinct
    }
end

User.first.issues

Generated:

SELECT DISTINCT "questions"."id", "questions"."created_at", "questions"."updated_at", "questions"."user_id", "questions"."mentor_id", "questions"."state", "questions"."title", "questions"."body", "categories"."id" AS "categories_id", "categories"."created_at" AS "categories_created_at", "categories"."updated_at" AS "categories_updated_at", "categories"."name" FROM "questions" LEFT OUTER JOIN "categories_questions" ON ("categories_questions"."question_id" = "questions"."id") LEFT OUTER JOIN "categories" ON ("categories"."id" = "categories_questions"."category_id") WHERE (("questions"."id" IN (SELECT "categories_questions"."question_id" FROM "categories_questions" WHERE (("categories_questions"."category_id" IN (SELECT "categories"."id" FROM "categories" INNER JOIN "expertises" ON (("expertises"."category_id" = "categories"."id") AND ("expertises"."user_id" = 1)) WHERE ("categories"."id" IS NOT NULL))) AND ("categories_questions"."question_id" IS NOT NULL)))) AND ("state" = 'created') AND ("questions"."created_at" > '2013-11-11 11:19:20.130656+0000')) ORDER BY "questions"."created_at" DESC

ActiveRecord model:

user = User.first
Question.includes(:categories, :user).
  where('categories.id' => user.categories).
  where("questions.state = 'created' AND questions.created_at > ?", 2.days.ago).
  order('questions.created_at DESC').uniq

Generated:

user.categories:

SELECT "categories".* FROM "categories" INNER JOIN "expertises" ON "categories"."id" = "expertises"."category_id" WHERE "expertises"."user_id" = 1

And the rest:

SELECT DISTINCT "questions"."id" AS t0_r0, "questions"."title" AS t0_r1, "questions"."body" AS t0_r2, "questions"."user_id" AS t0_r3, "questions"."created_at" AS t0_r4, "questions"."updated_at" AS t0_r5, "questions"."state" AS t0_r6, "questions"."mentor_id" AS t0_r7, "categories"."id" AS t1_r0, "categories"."name" AS t1_r1, "categories"."created_at" AS t1_r2, "categories"."updated_at" AS t1_r3, "users"."id" AS t2_r0, "users"."email" AS t2_r1 FROM "questions" LEFT OUTER JOIN "categories_questions" ON "categories_questions"."question_id" = "questions"."id" LEFT OUTER JOIN "categories" ON "categories"."id" = "categories_questions"."category_id" LEFT OUTER JOIN "users" ON "users"."id" = "questions"."user_id" WHERE "categories"."id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29) AND (questions.state = 'created' AND questions.created_at > '2013-11-11 11:27:21.725317') ORDER BY questions.created_at DESC
@matthewrudy
Copy link

Ok, this is a pretty good example of where ActiveRecord isn't so good.
Doing conditional queries over multiple join tables.

In particular you are joining the categories table twice

  • firstly to get acceptable categories for the user
  • secondly to join the category and load it

You're also joining the user table twice

  • firstly to find out the categories for the user
  • secondly to eager load the user for the question

I think you could do most of this through Arel if you needed to.
But I'm glad that Sequel handles this so well.

Is this actual code you want to write though?
This kind of "relevant question for you" seems like more of an ElasticSearch type query to me.

Whereby we can denormalise each question at the point it is created

{
  id: 1,
  category_ids: [1,2,3,4]
  title: "some question"
  user_id: [5,6,7,8]
  state: 'created'
}

and then do a nice filter on the document, with full text search at the same time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment