Skip to content

Instantly share code, notes, and snippets.

@metaskills
Created November 12, 2008 03:19
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save metaskills/24073 to your computer and use it in GitHub Desktop.
Save metaskills/24073 to your computer and use it in GitHub Desktop.
# Debugging limiting association joins for SQL Server Adapter.
Giving the following limiting SQL that would be normaly be generatd for the
test_eager_load_has_many_through_has_many(AssociationsJoinModelTest) case in
ActiveRecord for SQL Server. In this case the ORDER BY is totally moot.
SELECT DISTINCT TOP 1 [authors].id
FROM [authors]
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id])
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id])
WHERE (name = 'David') ORDER BY comments.id
Note that this SQL is generated by #construct_finder_sql_for_association_limiting
method. We can take two routes here. First we can follow the PostgreSQL adapter
again and resort to adding a connection #distinct and #add_order_by_for_association_limiting!
method that would take the the entire SQL into a sub select and use aliasing. In IRB
it would looks something like this for #distinct
# Arg Values
columns = '[authors].id'
order_by = 'comments.id'
# Execution And Return
>> order_columns = order_by.split(',').collect { |s| s.split.first }
=> ["comments.id"]
>> order_columns.delete_if &:blank?
=> ["comments.id"]
>> order_columns = order_columns.zip((0...order_columns.size).to_a).map { |s,i| "#{s} AS alias_#{i}" }
=> ["comments.id AS alias_0"]
>> sql = "DISTINCT ON (#{columns}) #{columns}, "
=> "DISTINCT ON ([authors].id) [authors].id, "
>> sql << order_columns * ', '
=> "DISTINCT ON ([authors].id) [authors].id, comments.id AS alias_0"
Now for PostgreSQLs #add_order_by_for_association_limiting! method.
# Arg Values
sql = %|SELECT DISTINCT TOP 1 [authors].id
FROM [authors]
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id])
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id])
WHERE (name = 'David')|
options = {:order => 'comments.id'}
# Execution And Return
>> order = options[:order].split(',').collect { |s| s.strip }.reject(&:blank?)
=> ["comments.id"]
>> order.map! { |s| 'DESC' if s =~ /\bdesc$/i }
=> [nil]
>> order.zip((0...order.size).to_a)
=> [[nil, 0]]
>> order = order.zip((0...order.size).to_a).map { |s,i| "id_list.alias_#{i} #{s}" }.join(', ')
=> "id_list.alias_0 "
Now minus the fact that there is no DISTINCT ON in SQL Server, the translated SQL
would look like this and is quite valid. Despite that two columns are being returned
for the select_all, rails will only look for the column that matches the primary_key.
So the alias_n will doubtfully be an issue.
# Before
SELECT DISTINCT TOP 1 [authors].id
FROM [authors]
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id])
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id])
WHERE (name = 'David') ORDER BY comments.id
# Adapter Implementation
def distinct(columns, order_by)
return "DISTINCT #{columns}" if order_by.blank?
order_columns = order_by.split(',').collect { |s| s.split.first }
order_columns.delete_if(&:blank?)
order_columns = order_columns.zip((0...order_columns.size).to_a).map{ |s,i| "#{s} AS alias_#{i}" }.join(', ')
"DISTINCT #{columns}, #{order_columns}"
end
def add_order_by_for_association_limiting!(sql, options)
return sql if options[:order].blank?
order = options[:order].split(',').collect { |s| s.strip }.reject(&:blank?)
order.map! { |s| 'DESC' if s =~ /\bdesc$/i }
order = order.zip((0...order.size).to_a).map { |s,i| "id_list.alias_#{i} #{s}" }.join(', ')
sql.replace "SELECT * FROM (#{sql}) AS id_list ORDER BY #{order}"
end
# After
SELECT * FROM (
SELECT DISTINCT TOP 1 [authors].id, comments.id AS alias_0
FROM [authors]
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id])
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id])
WHERE (name = 'David')
) AS id_list ORDER BY id_list.alias_0
However there is one fatal flaw in this implementation. If the ordered by column
is not unique for the DISTINCT column in the catesian product, un-unique values
will be returned. For instance #test_select_limited_ids_list in HasAndBelongsToManyAssociationsTest
will fail with <["1", "2"]> expected but was <["1", "1", "1", "2"]>. There are two
possible recourses.
First the fact of the matter is that the order by is mostly moot in the limiting
SQL, as far as I can tell from the tests. So this implementation would work just
fine.
def add_order_by_for_association_limiting!(sql, options)
sql
end
However instead of this, we should make an attempt like the PostgreSQL adapter to
maintain ordering for distinct values. Based on that goal the following URL is an
excellent article on our implementation:
http://weblogs.sqlteam.com/jeffs/archive/2007/12/13/select-distinct-order-by-error.aspx
# Adapter Implementation
def add_order_by_for_association_limiting!(sql, options)
return sql if options[:order].blank?
columns = sql.match(/SELECT\s+DISTINCT(.*)FROM/)[1].strip
sql.sub!(/SELECT\s+DISTINCT/,'SELECT')
sql << "GROUP BY #{columns} ORDER BY #{order_to_min_set(options[:order])}"
end
def order_to_min_set(order)
dir = ''
orders = order.split(',').map(&:strip).reject(&:blank?)
mins = orders.map do |o|
if o =~ /\b(asc|desc)$/i
dir = $1
o = o.sub($1,'').strip
end
"MIN(#{o})"
end
"#{mins.join(', ')} #{dir}".strip
end
# Working SQL
SELECT TOP 1 [authors].id
FROM [authors]
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id])
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id])
WHERE (name = 'David')
GROUP BY [authors].id
ORDER BY MIN(comments.id), MIN(comments.post_id)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment