Skip to content

Instantly share code, notes, and snippets.

@jpr5
Created January 14, 2010 19:19
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jpr5/277406 to your computer and use it in GitHub Desktop.
Save jpr5/277406 to your computer and use it in GitHub Desktop.
JOIN logic assessment, logic and fails (0.9x-current)
See associated gist for simple definitions of Project, User, Task, Answer models
referenced herein.
Summary:
I believe the logic about how JOINs are constructed needs to fundamentally
change. In order to create the correct JOIN SQL, the logic needs to be
aware of and account for how an individual query param's "links" are related
to each other *only*, instead of how they the query param's themselves
relate to each other (which is how I intepret how it works now).
To begin with, think of 'project.user' as really 'task.project.user', and
'user.fname' really as 'task.user.fname' -- the Model's own table is the
implicit starting point (but not guaranteed to be a chaining point for any
given JOIN). This is important in knowing when to add an alias for the
originating/Model's table.
Let's call this "path" the "cumulative scope" (CS), which refers to the full
JOIN pathway construction necessary for a given query param and its
associated conditions.
(CE: I'm not entirely sure that the current implementation has access to
these concepts at the time join_statement() is called, so implementing
this logic might require a greater scope of refactoring.)
The idea is to iterate over each param, and "walking" each one's CS
left-to-right, associating a unique table alias for each "hop", then
accumulating any necessary JOIN statements for "hops" that haven't yet been
constructed.
Some Axioms:
(1) if a query param's tailing link results in a new table alias, the
associated where condition must also refer to that same alias.
(2) table aliases should be unique based solely on a given hop's scope/path
from the base object.
(3) tailing relationships should resolve to keys: Task.all('answer.user' =>
obj) should resolve to answer.user_id => obj.primary_key (pseudocode).
This might already be the case, just haven't tested this out yet.
A Demonstration Example
#1 Task.all('user.fname' => 'Foo', 'user.lname' => 'Bar', 'project.user.fname' => 'PHB')
If I were writing this SQL by hand, I'd probably write:
SELECT * FROM tasks
INNER JOIN users users_1 on tasks.task_id = users_1.id
INNER JOIN projects projects_1 on tasks.project_id = projects_1.id
INNER JOIN users users_2 on projects_1.user_id = users_2.id
WHERE users_1.fname = 'Foo' AND
users_1.lname = 'Bar' AND
users_2.name = 'PHB'
How to Logically Generate this SQL
(1) Start with CS['task'] = "tasks", by virtue of the storage_name being
referenced already in the query "SELECT * FROM tasks".
First param: 'user.fname' => 'Foo'
(2) Again, we really mean "task.user.fname". CS['task'] exists, but
not CS['task.user']. So we create the alias for this "hop":
CS['task.user'] = 'users_1'.
(3) Since we just created a CS, we don't have a JOIN set up for it.
Accumulate a JOIN for that CS:
English: create a join from CS['task'] to CS['task.user']
SQL: INNER JOIN users users_1 ON tasks.user_id = users_1.id
(4) Accumulate condition for the param:
English: constrain to any task.user whose fname is 'Foo'
SQL: WHERE users_1.fname = 'Foo'
Second param: 'user.lname' => 'Bar'
(5) CS path is 'task.user', and we already have CS['task'] and
CS['task.user']. The table alias is 'users_1'.
(6) Since we already had a CS/table alias, we don't need any new joins
to service this param.
(7) Accumulate condition for the param:
English: constrain to any task.user whose lname is 'Bar'
SQL: WHERE users_1.lname = 'Bar'
Third param: 'project.user.fname' => 'PHB'
(8) CS path is 'task.project.user'. We already have a CS['task'], but
there's no CS['task.project'] or CS['task.project.user'] yet. So:
CS['task.project'] = 'projects_1'
CS['task.project.users'] = 'users_2' # incremented from endpoint tablenames
(9) Each time we create a new CS, we need to set up a JOIN for it.
Accumulate a JOIN for each new CS:
English: create a join from CS['task'] to CS['task.project']
SQL: INNER JOIN projects projects_1 ON tasks.project_id = projects_1.id
English: create a join from CS['task.project'] to CS['task.project.user']
SQL: INNER JOIN users users_2 ON project_1.user_id = users_2.id
(10) Accumulate condition for the param:
English: constrain to any task.project.user whose fname is 'PHB'
SQL: WHERE users_2.lname = 'PHB'
Resulting SQL:
Original SQL: SELECT * FROM tasks
Joins: INNER JOIN users users_1 ON tasks.user_id = users_1.id
INNER JOIN projects projects_1 ON tasks.project_id = projects_1.id
INNER JOIN users users_2 ON project_1.user_id = users_2.id
Conditions: WHERE users_1.fname = 'Foo'
AND WHERE users_1.lname = 'Bar'
AND WHERE users_2.lname = 'PHB'
Looks just like the original hand-written version.
Random Concluding Thoughts
(1) This algorithm appears capable of supporting convoluted (but not
invalid) invocations like
Task.all('project.tasks.user_id' => [2,1,3])
This would result in SQL that looks like:
SELECT * FROM tasks
INNER JOIN projects projects_1 ON tasks.project_id = projects_1.id
INNER JOIN tasks tasks_1 ON projects_1.id = tasks_1.project_id
WHERE tasks_1.user_id IN (2, 1, 3)
The Tasks table would get aliased once because CS['task'] is not the
same as CS['task.project.task'] in scope.
(2) I'm making an assumption that non-query conditional/join parameters are
already handled/accounted for (like :order, :limit, etc) in logic
elsewhere. But from a regression vs. feature standpoint, I don't know
if:
:order => ['project.tasks.user_id']
is supported (yet); if it is now or will be in the future, the handling
logic would need to be able to communicate (in this case):
CS['task.project.task'] # 'tasks_1'
to the portion of DM's DO Adapter responsible for generating the ORDER
statement. Then it would be easy.
Other Examples to Ponder
#2 Task.all('user.fname' => 'task_owner', 'project.user.fname' => 'project_owner')
English: Give me all tasks belonging to user named 'task_owner', which
belong to a project belonging to a user named 'project_owner'.
SQL: SELECT * FROM tasks
INNER JOIN users users_1 ON tasks.task_id = users_1.id
INNER JOIN projects projects_1 ON tasks.project_id = projects_1.id
INNER JOIN users users_2 ON projects_1.user_id = users_2.id
WHERE users_1.fname = 'task_owner'
AND users_2.fname = 'project_owner'
#3 Task.all('user.fname' => 'task_owner', 'project.user.fname' => 'project_owner', 'answers.user.fname' => 'answer_owner')
English: Give me all tasks belonging to user named 'task_owner', which
belong to a project belonging to a user named 'project_owner',
and which contain answers that belong to a user named
'answer_owner'.
SQL: SELECT * FROM tasks
INNER JOIN users users_1 ON tasks.task_id = users_1.id
INNER JOIN projects projects_1 ON tasks.project_id = projects_1.id
INNER JOIN users users_2 ON projects_1.user_id = users_2.id
INNER JOIN answers answers_1 ON tasks.id = answers_1.task_id
INNER JOIN users users_3 ON answers_1.user_id = users_3.id
WHERE users_1.fname = 'task_owner'
AND users_2.fname = 'project_owner'
AND users_3.fname = 'answer_owner'
require 'rubygems'
require 'dm-core'
DataMapper::Logger.new(STDOUT, :debug)
DataMapper.setup(:default, "mysql://root@localhost/foo")
class Project
include DataMapper::Resource
property :id, Serial
property :title, String
property :user_id, Integer
has 1, :user
has n, :tasks
end
class User
include DataMapper::Resource
property :id, Serial
property :name, String
has n, :tasks
end
class Task
include DataMapper::Resource
property :id, Serial
property :str, String
property :project_id, Integer
property :user_id, Integer
belongs_to :project
belongs_to :user
end
::DataMapper.auto_migrate!
puts "1"
Task.all(:str => 'foo', 'user.name' => 'task_owner').project(:title => 'title', 'user.name' => 'project_owner').to_a
puts "2"
Task.all(:str => 'foo', 'user.name' => 'task_owner').projects(:title => 'title', 'user.name' => 'project_owner').to_a
puts "3"
Task.all(:str => 'foo', 'project.title' => 'title', 'user.name' => 'task_owner', 'project.user.name' => 'project_owner').to_a
require 'rubygems'
require 'ruby-debug'
debugger; 1
### Join Query Generation for current master all the way back to 0.9.11 (NOT A SINGLE CORRECT QUERY :-( )
#
# Tests:
# (1) Task.all(:str => 'foo', 'user.name' => 'task_owner').project(:title => 'title', 'user.name' => 'project_owner').to_a
# (2) Task.all(:str => 'foo', 'user.name' => 'task_owner').projects(:title => 'title', 'user.name' => 'project_owner').to_a
# (3) Task.all(:str => 'foo', 'project.title' => 'title', 'user.name' => 'task_owner', 'project.user.name' => 'project_owner').to_a
#
## Current Master (dm-core 39a2e2c, dm-more ac7bbdf8)
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').project(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT `tasks`.`project_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# WHERE (`users`.`name` = 'task_owner' AND `tasks`.`str` = 'foo')
# SELECT `projects`.`id`, `projects`.`title`, `projects`.`user_id` FROM `projects`
# INNER JOIN `users` ON `projects`.`id` = `users`.`project_id`
# WHERE (`projects`.`title` = 'title' AND `users`.`name` = 'project_owner' AND 1 = 0)
# GROUP BY `projects`.`id`, `projects`.`title`, `projects`.`user_id` ORDER BY `projects`.`id`
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').projects(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT `tasks`.`project_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# WHERE (`users`.`name` = 'task_owner' AND `tasks`.`str` = 'foo')
# SELECT `projects`.`id`, `projects`.`title`, `projects`.`user_id` FROM `projects`
# INNER JOIN `users` ON `projects`.`id` = `users`.`project_id`
# WHERE (`projects`.`title` = 'title' AND `users`.`name` = 'project_owner' AND 1 = 0)
# GROUP BY `projects`.`id`, `projects`.`title`, `projects`.`user_id` ORDER BY `projects`.`id`
#
#Task.all(:str => 'foo', 'project.title' => 'title', 'user.name' => 'task_owner', 'project.user.name' => 'project_owner').to_a
# SELECT `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# INNER JOIN `projects` ON `users`.`project_id` = `projects`.`id`
# INNER JOIN `users` `users_1` ON `projects`.`id` = `users_1`.`project_id`
# WHERE (`users`.`name` = 'task_owner' AND `projects`.`title` = 'title' AND `users`.`name` = 'project_owner' AND `tasks`.`str` = 'foo')
# GROUP BY `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` ORDER BY `tasks`.`id`
## v0.10.2
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').project(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT `tasks`.`project_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# WHERE (`users`.`name` = 'task_owner' AND `tasks`.`str` = 'foo')
# SELECT `projects`.`id`, `projects`.`title`, `projects`.`user_id` FROM `projects`
# INNER JOIN `users` ON `projects`.`id` = `users`.`project_id`
# WHERE (`projects`.`title` = 'title' AND `users`.`name` = 'project_owner' AND 1 = 0)
# GROUP BY `projects`.`id`, `projects`.`title`, `projects`.`user_id`
# ORDER BY `projects`.`id`
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').projects(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT `tasks`.`project_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# WHERE (`users`.`name` = 'task_owner' AND `tasks`.`str` = 'foo')
# SELECT `projects`.`id`, `projects`.`title`, `projects`.`user_id` FROM `projects`
# INNER JOIN `users` ON `projects`.`id` = `users`.`project_id`
# WHERE (`projects`.`title` = 'title' AND `users`.`name` = 'project_owner' AND 1 = 0)
# GROUP BY `projects`.`id`, `projects`.`title`, `projects`.`user_id` ORDER BY `projects`.`id`
#
#Task.all(:str => 'foo', 'project.title' => 'title', 'user.name' => 'task_owner', 'project.user.name' => 'project_owner').to_a
# SELECT `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# INNER JOIN `projects` ON `tasks`.`project_id` = `projects`.`id`
# INNER JOIN `users` ON `projects`.`id` = `users`.`project_id`
# WHERE (`users`.`name` = 'task_owner' AND `projects`.`title` = 'title' AND `users`.`name` = 'project_owner' AND `tasks`.`str` = 'foo')
# GROUP BY `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` ORDER BY `tasks`.`id`
# Not unique table/alias: 'users' (DataObjects::SQLError)
## 0.10.1
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').project(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# WHERE `users`.`name` = 'task_owner' AND `tasks`.`str` = 'foo'
# GROUP BY `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` ORDER BY `tasks`.`id`
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').projects(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# WHERE `users`.`name` = 'task_owner' AND `tasks`.`str` = 'foo'
# GROUP BY `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` ORDER BY `tasks`.`id`
#
#Task.all(:str => 'foo', 'project.title' => 'title', 'user.name' => 'task_owner', 'project.user.name' => 'project_owner').to_a
# SELECT `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# INNER JOIN `projects` ON `tasks`.`project_id` = `projects`.`id`
# INNER JOIN `users` ON `projects`.`id` = `users`.`project_id`
# WHERE `users`.`name` = 'task_owner' AND `projects`.`title` = 'title' AND `tasks`.`str` = 'foo' AND `users`.`name` = 'project_owner'
# GROUP BY `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` ORDER BY `tasks`.`id`
# Not unique table/alias: 'users' (DataObjects::SQLError)
## 0.10.0
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').project(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# WHERE `users`.`name` = 'task_owner' AND `tasks`.`str` = 'foo'
# GROUP BY `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` ORDER BY `tasks`.`id`
#Task.all(:str => 'foo', 'user.name' => 'task_owner').projects(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# WHERE `users`.`name` = 'task_owner' AND `tasks`.`str` = 'foo'
# GROUP BY `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` ORDER BY `tasks`.`id`
#Task.all(:str => 'foo', 'project.title' => 'title', 'user.name' => 'task_owner', 'project.user.name' => 'project_owner').to_a
# SELECT `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` FROM `tasks`
# INNER JOIN `users` ON `tasks`.`user_id` = `users`.`id`
# INNER JOIN `projects` ON `tasks`.`project_id` = `projects`.`id`
# INNER JOIN `users` ON `projects`.`id` = `users`.`project_id`
# WHERE `users`.`name` = 'task_owner' AND `projects`.`title` = 'title' AND `tasks`.`str` = 'foo' AND `users`.`name` = 'project_owner'
# GROUP BY `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` ORDER BY `tasks`.`id`
# Not unique table/alias: 'users' (DataObjects::SQLError)
## 0.10.0-RC2
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').project(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" FROM "tasks"
# INNER JOIN "users" ON "tasks"."user_id" = "users"."id"
# WHERE "users"."name" = 'task_owner' AND "tasks"."str" = 'foo'
# GROUP BY "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" ORDER BY "tasks"."id"
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').projects(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" FROM "tasks"
# INNER JOIN "users" ON "tasks"."user_id" = "users"."id"
# WHERE "users"."name" = 'task_owner' AND "tasks"."str" = 'foo'
# GROUP BY "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" ORDER BY "tasks"."id"
#
#Task.all(:str => 'foo', 'project.title' => 'title', 'user.name' => 'task_owner', 'project.user.name' => 'project_owner').to_a
# SELECT "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" FROM "tasks"
# INNER JOIN "users" ON "tasks"."user_id" = "users"."id"
# INNER JOIN "projects" ON "tasks"."project_id" = "projects"."id"
# INNER JOIN "users" ON "projects"."id" = "users"."project_id"
# WHERE "users"."name" = 'task_owner' AND "projects"."title" = 'title' AND "tasks"."str" = 'foo' AND "users"."name" = 'project_owner'
# GROUP BY "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" ORDER BY "tasks"."id"
# Not unique table/alias: 'users' (DataObjects::SQLError)
## 0.10.0-RC1
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').project(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" FROM "tasks"
# INNER JOIN "users" ON "users"."id" = "tasks"."user_id"
# WHERE "users"."name" = 'task_owner' AND "tasks"."str" = 'foo'
# GROUP BY "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" ORDER BY "tasks"."id"
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').projects(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" FROM "tasks"
# INNER JOIN "users" ON "users"."id" = "tasks"."user_id"
# WHERE "users"."name" = 'task_owner' AND "tasks"."str" = 'foo'
# GROUP BY "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" ORDER BY "tasks"."id"
#
#Task.all(:str => 'foo', 'project.title' => 'title', 'user.name' => 'task_owner', 'project.user.name' => 'project_owner').to_a
# SELECT "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" FROM "tasks"
# INNER JOIN "users" ON "users"."id" = "tasks"."user_id"
# INNER JOIN "projects" ON "projects"."id" = "tasks"."project_id"
# INNER JOIN "users" ON "users"."project_id" = "projects"."id"
# WHERE "users"."name" = 'task_owner' AND "projects"."title" = 'title' AND "tasks"."str" = 'foo' AND "users"."name" = 'project_owner'
# GROUP BY "tasks"."id", "tasks"."str", "tasks"."project_id", "tasks"."user_id" ORDER BY "tasks"."id"
# Not unique table/alias: 'users' (DataObjects::SQLError)
## 0.9.11
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').project(:title => 'title', 'user.name' => 'project_owner').to_a
# SELECT `projects`.`id`, `projects`.`title`, `projects`.`user_id` FROM `projects`
# INNER JOIN `tasks` ON (`projects`.`id` = `tasks`.`project_id`)
# INNER JOIN `users` ON (`users`.`id` = `tasks`.`user_id`)
# WHERE (`users`.`name` = 'task_owner') AND (`tasks`.`str` = 'foo') AND (`users`.`name` = 'project_owner') AND (`projects`.`title` = 'title')
# ORDER BY `projects`.`id`
#
#Task.all(:str => 'foo', 'user.name' => 'task_owner').projects(:title => 'title', 'user.name' => 'project_owner').to_a
# undefined method `projects' for #<DataMapper::Collection:0x102391018> (NoMethodError)
#
#Task.all(:str => 'foo', 'project.title' => 'title', 'user.name' => 'task_owner', 'project.user.name' => 'project_owner').to_a
# SELECT `tasks`.`id`, `tasks`.`str`, `tasks`.`project_id`, `tasks`.`user_id` FROM `tasks`
# INNER JOIN `users` ON (`users`.`id` = `tasks`.`user_id`)
# INNER JOIN `projects` ON (`projects`.`id` = `tasks`.`project_id`)
# INNER JOIN `users` ON (`projects`.`id` = `users`.`project_id`)
# WHERE (`users`.`name` = 'task_owner') AND (`projects`.`title` = 'title') AND (`users`.`name` = 'project_owner') AND (`tasks`.`str` = 'foo')
# ORDER BY `tasks`.`id`
Some Design Thoughts
DM in the Abstract
- Model definition
- Model queries
- Model manipulation
- Model updates
Finder Types
- Model.{finder_specifying_size,relationship}(Model.property => value)
- Model.{finder_specifying_size,relationship}(Model.relationship.{relationship,property} => value)
- Model.{finder_specifying_size,relationship}(some query attributes).{finder_specifying_size,relationship,property}(more query attributes)
Basically, Query's are created from either finders (all, first, last) or relationships.
Variants of query params
- Model.all(:property => :foo)
- Model.all('property' => 'foo')
- Model.all('relationship.property' => :foo)
- Model.all('relationship.relationship.property' => :foo)
- Model.all(*[some set of non-model-related conditions (like limit, etc)])
Variants of chaining
- Using finders: Model.all(:foo => :bar).all(:baz => :blort) [probably easy]
- Using relationships: Model.all(:foo => :bar).relationship(:baz => :blort) [probably harder]
Observations
- Queries chained through relationships fire separate queries
- Should attempt to extend existing logic to detect if the chained call
is a relationship (not just another finder), and merge down any
model-related conditions within
Axioms
(1) Chained relationships merge downward with their scope
Model.all(:foo => "foo").relationship(:bar => "bar", :baz => "baz")
Query('Model.foo' => "foo", 'Model.relationship.bar' => "bar", 'Model.relationship.baz' => 'baz')
(2) Limit implies order, order is on CS.pk
Model.all(:foo => "foo").relationship(:bar => "bar", :limit => 5)
Query('Model.foo' => "foo", 'Model.relationship.bar' => "bar", :limit => 5, :order => Model.relationship.pk)
(3) Non-model-related operators (order, limit, offset) merge downward,
precedence is last-wins
Model.all(:foo => "foo", :limit => 10).all(:bar => "bar", :limit => 5) # limits imply order
Query('Model.foo' => "foo", 'Model.bar' => "bar", :limit => 5, :order => Model.pk)
(4) From previous, finder precedence becomes limiting-first (narrow
scope beats wide scope)
Model.all(:foo => "foo").first(:bar => 'bar')
Query('Model.foo' => "foo", 'Model.bar' => 'bar', :limit => 1, :order => Model.pk)
Logic: DSL->Query [DO/SQL case]
Goal is to reduce all queries chained down to a single base Query that
references CS (aka Paths).
Use case: Task.all(:type => 2, 'user.name' => 'task_owner').project(:type => 7, 'user.name' => "project_owner")
Step 1: Ruby DSL -> Query object
Collapse unkicked chain elements into base-level Query object with CS
(Path) references
<- Task.all(:type => 2, 'user.name' => 'task_owner').project(:type => 7, 'user.name' => "project_owner")
-> Query('Task.type' => 2,
'Task.user.name' => 'task_owner',
'Task.project.type' => 7, 'Task.project.user.name' => "project_owner")
Step 2: Query object -> SQL
Iterate over each model-related Query param to build SQL. For any
non-base-attribute object (Path), accumulate joins and associated query
conditions.
Rough logic documented at: http://gist.github.com/277406#file_1_logic.txt
<- Query('Task.type' => 2,
'Task.user.name' => 'task_owner',
'Task.project.type' => 7, 'Task.project.user.name' => "project_owner")
-> SELECT * FROM tasks
INNER JOIN users users_1 ON tasks.user_id = users_1_id
INNER JOIN projects projects_1 ON tasks.project_id = projects_1.id
INNER JOIN users users_2 ON projects_1.user_id = users_2.id
WHERE tasks.type = 2
AND users_1.name = 'task_owner'
AND projects_1.type = 7
AND users_2.name = 'project_owner'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment