Skip to content

Instantly share code, notes, and snippets.

@willkoehler
Last active August 29, 2015 14:00
Show Gist options
  • Save willkoehler/11373993 to your computer and use it in GitHub Desktop.
Save willkoehler/11373993 to your computer and use it in GitHub Desktop.
Rails eager loading variations. The pure "Rails" way is ~5x slower than using raw SQL snippets.
class Hospital < ActiveRecord::Base
has_many :user_assignments, :dependent => :destroy
has_many :users, :through => :user_assignments
# 30 hospitals
# There are 36 columns in this table, so we want to use .select() to limit which ones are pulled by the query
end
class UserAssignment < ActiveRecord::Base
belongs_to :user
belongs_to :hospital
# 750 user assignments
end
class User < ActiveRecord::Base
has_many :user_assignments, :dependent => :destroy
has_many :hospitals, :through => :user_assignments
# 160 users
# Ugly, but 5x faster than op2 and op3
# Takes 2ms
scope :grid_fields, -> { joins('LEFT JOIN user_assignments ON user_id = users.id LEFT JOIN hospitals ON hospitals.id = hospital_id').
group('users.id').select([:id, :name, :email, :role, 'min(hospitals.name) as first_hospital_name']) }
# Brings in all 10 user columns and 36 hospital columns. Takes a long time to process all that data.
# select() doesn't help - it's ignored
# Takes 75ms
scope :grid_fields_op2, -> { eager_load(:hospitals) }
# 3 separate queries, User, UserAssignment, and Hospital.
# User load - takes 1ms
# UserAssignment - load takes 2ms
# Hospital load - takes 7ms (selects all 36 columns - not easy to change this)
scope :grid_fields_op3, -> { preload(:hospitals).select([:id, :name, :email, :role]) }
# (needed for op2 and op3)
# Note that when we use this function, the grid render is ~3x slower than just grabbing the raw
# 'first_hospital_name' field from the first query. 100ms vs 30ms grabbing the raw field
# def first_hospital_name
# hospitals.first.try(:name)
# end
end
<table id='users_grid'>
<tr>
<th>Name</th>
<th>Email</th>
<th>Hospital</th>
<th>Role</th>
<th></th>
</tr>
<%= render(@users.grid_fields) || raw('<tr class="message"><td colspan="5">No users found</td></tr>') %>
</table>
<tr>
<%= fixed_width_column 160, user.name %>
<%= fixed_width_column 210, user.email %>
<%= fixed_width_column 210, user.first_hospital_name %>
<%= fixed_width_column 95, user.role_humanized %>
<td width=30><%= link_to 'Edit', edit_user_path(user) %></td>
</tr>
@willkoehler
Copy link
Author

Awesome! grid_fields_arel_ar dropped right into my code and performed as desired. It also works when chained with scopes like.

scope :by_hospital, ->(id) { joins(:user_assignments).where(user_assignments: { hospital_id: id }) }

This is a great tool to add to my repertoire.

Also nice trick with hospitals[:name].minimum.as('first_hospital_name') I did not know about that syntax.

In regards to my desired outcome, I'm almost certainly over optimizing the performance of this app given my requirements. But I want to know what the possibilities are for Rails. Then I'll back it down and find the right balance between convenience/readability and performance.

Thanks for your help.

@phil-monroe
Copy link

I've been looking for a way to clean up the syntax a bit. I would probably push t up into ActiveRecord::Base via a mixin. Not super thrilled with it, but kinda nifty to try out and i do feel it is a bit more readable. I dunno... maybe it is cleaner to just do variable assignment ala users, uas, hospitals = User.arel_table, UserAssignments.arel_table, Hospital.arel_table. What do you think?

I didn't convert the query to fit your models, but the same structure applies: User has_many Backlog through BacklogSharing

  def self.t
    @@arel_tables ||= Hash.new {|h, k| h[k] = k.to_s.camelize.constantize.arel_table }
  end

  scope :min_shared_backlog_names, -> {
    join_tables = t[:user]
      .join(t[:backlog_sharing], Arel::Nodes::OuterJoin).on(t[:user][:id].eq(t[:backlog_sharing][:user_id]))
      .join(t[:backlog], Arel::Nodes::OuterJoin).on(t[:backlog][:id].eq(t[:backlog_sharing][:backlog_id])).join_sources

    joins(join_tables)
      .group(t[:user][:id])
      .select([:id, t[:backlog][:name].minimum.as('bl_name')])
  }

Now to get rid of that Arel::Nodes::OuterJoin... perhaps something like t[:user].join(t[:backlog_sharing], :inner) instead... should just be a simple lookup hash here: https://github.com/rails/arel/blob/master/lib/arel/factory_methods.rb#L18

@willkoehler
Copy link
Author

I agree the syntax needs to be cleaned up some. It's kind of wordy. Your t solution is pretty slick. My only reservation is that it's a bit obscure. Maybe renaming to arel_table_for[:hospital] would help, but then it's really no better than Hospital.arel_table

Another thought I had was moving the left_joins definition out of the scope block, to just below the has_many statements. left_joins is essentially acting like a has_many association in this context. Maybe rename it to hospitals_outer or something like that? Then the layout of the scope and associations are similar to what you would have with the pure AR version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment