@danapplegate
Last active August 29, 2015 14:16
Inverse relation using "through"

At several points in our code, we might do something similar to the following:

$parentClass = ParentClasses::model()->findByPk($parentClassId);
...
$sessions = $parentClass->sessions;
foreach ($sessions as $session) {
    // Do something with the parent class of the session, perhaps
    // rendering it in a subview
    $parentClass = $session->parentClass;
    echo $parentClass->title;
}

In the same request, we might also do something like this:

$units = $parentClass->units;
foreach ($units as $unit) {
    foreach ($unit->sessions as $session) {
        // Render the same subview as above, using the $session's parentClass
        $parentClass = $session->parentClass;
        echo $parentClass->title;
    }
}

Note that in both cases the subview operates on what appears to be the same $parentClass object. However, the objects are not identical: one was loaded through the direct relation ParentClasses HAS MANY Session, the other through the indirect chain ParentClasses HAS MANY Unit HAS MANY Session (which can be expressed as ParentClasses HAS MANY Session THROUGH Unit).

In our CActiveRecord lazy-load world, this isn't an apparent problem. If we do nothing, Yii simply lazy-loads the relation when $session->parentClass is accessed. This is unfortunate, because it leads to many database calls that fetch the same ParentClasses row, and the resulting objects are not identical. Historically, we've fixed this by eager loading both relations:

// with() takes each relation path as its own argument (or an array of paths)
$parentClass = ParentClasses::model()
    ->with('units.sessions.parentClass', 'sessions.parentClass')
    ->findByPk($pcpk);
foreach ($parentClass->units as $unit) {
    foreach ($unit->sessions as $session) {
        echo $session->parentClass->title; // Does not result in an additional DB call
    }
}

This isn't ideal, because we're still doing two lookups for the same parent class, and they aren't identical objects, so loading a relation on one is not reflected in the other. But at least it doesn't break.
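The difference between "equal" and "identical" copies can be shown without any framework. The following is a minimal sketch in plain PHP; `ParentClassRow` is a hypothetical stand-in for a loaded record, not a Yii class, and the two `new` calls stand in for the two eager-load paths.

```php
<?php
// Hypothetical stand-in for a loaded ParentClasses row; not a Yii class.
class ParentClassRow
{
    public $id;
    public $title;
    public $units = null; // null means "relation not loaded yet"

    public function __construct($id, $title)
    {
        $this->id = $id;
        $this->title = $title;
    }
}

// Two lookups of the same row (for example via 'sessions.parentClass' and
// via 'units.sessions.parentClass') each build their own object.
$copyA = new ParentClassRow(7, 'Algebra I');
$copyB = new ParentClassRow(7, 'Algebra I');

$equalValues = ($copyA == $copyB);  // same class and same property values
$identical   = ($copyA === $copyB); // same instance?

// Loading a relation on one copy leaves the other copy untouched.
$copyA->units = ['unit 1', 'unit 2'];
$loadedOnB = ($copyB->units !== null);
```

Here `$equalValues` is true while `$identical` and `$loadedOnB` are false, which is the situation the eager-loading workaround leaves us in.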

In a Hydration world, where a goal is to eliminate these duplicate queries, this presents more of a problem. Hydration will not allow us to lazy-load the relation if we forget to Hydrate. Within our view structure, any one of the following relations may be accessed and must be Hydrated:

$parentClass->sessions;
$session1 = current($parentClass->sessions);
$session1->parentClass; // ParentClass copy 1, must be hydrated using 'sessions.parentClass'

$parentClass->units;
$unit = current($parentClass->units);
$session2 = current($unit->sessions);
$session2->parentClass; // ParentClass copy 2, must be hydrated using 'units.sessions.parentClass'

$unit->parentClass; // ParentClass copy 3, must be hydrated using 'units.parentClass'

// Sometimes we even go a level deeper
$session1->unit->parentClass; // ParentClass copy 4, must be hydrated using 'sessions.unit.parentClass'
$session2->unit->parentClass; // ParentClass copy 5, must be hydrated using 'units.sessions.unit.parentClass'

This leads to confusing hydration lists and many unnecessary DB lookups of the same ParentClasses, as well as multiple, distinct copies of the same ParentClass in memory.

Inverse relations

How do inverse relations fit into this? Consider these examples:

$parentClass->sessions;
$parentClass->units;
$unit->sessions;

These can all benefit from inverse relation hydration, because they are first-order relations:

// Hydrate $parentClass relations 'sessions, units'
$session = current($parentClass->sessions); // Inversed, so $session->parentClass will be set
$session->parentClass; // No additional query, same object as $parentClass

$unit = current($parentClass->units); // Inversed
$unit->parentClass; // No additional query, identical to $parentClass

This is an improvement. However, the following example still breaks:

// Hydrate $parentClass relations 'sessions, units.sessions'
$unit = current($parentClass->units); // Inversed, $parentClass is set on each $unit
$session = current($unit->sessions); // Inversed, $unit is set on each $session
$unit->parentClass; // All good
$session->parentClass; // Breaks, because 'units.sessions.parentClass' has not been hydrated

In general, inverse relations alone can make first-order relations more efficient and useful, but cannot help us with second-order relations (the ... HAS MANY ... THROUGH ... pattern).
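As a rough illustration of why inversing stops at first-order relations, here is a plain-PHP sketch. The record classes and the `hydrateInversed()` helper are hypothetical and only mimic what an inversing hydrator might do; they are not the actual Hydration code.

```php
<?php
// Hypothetical plain records; property names mirror the relations above.
class ParentClassRec { public $units = []; public $sessions = []; }
class UnitRec { public $parentClass = null; public $sessions = []; }
class SessionRec { public $unit = null; public $parentClass = null; }

/**
 * Attach $children to $parent under $relation, and point each child's
 * $inverse property back at the very same $parent instance.
 */
function hydrateInversed($parent, $relation, array $children, $inverse)
{
    foreach ($children as $child) {
        $child->$inverse = $parent;
    }
    $parent->$relation = $children;
}

$parentClass = new ParentClassRec();
$unit = new UnitRec();
$session = new SessionRec();

// Equivalent of hydrating 'units.sessions' with inversed relations.
hydrateInversed($parentClass, 'units', [$unit], 'parentClass');
hydrateInversed($unit, 'sessions', [$session], 'unit');

$unitSeesParent    = ($unit->parentClass === $parentClass); // first-order inverse
$sessionSeesUnit   = ($session->unit === $unit);            // first-order inverse
$sessionSeesParent = ($session->parentClass !== null);      // second-order
```

Each hydration step only knows about its immediate parent, so `$sessionSeesParent` stays false: nothing ever sets `$session->parentClass`, even though the right object is reachable as `$session->unit->parentClass`.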

Within the API, where we use the Hydration model, we've solved this either by specifying long and complex hydration lists to account for each of these chained relation cases, or by manually setting these related objects to minimize DB lookups. Neither approach is very scalable or maintainable.

Use through to simplify this pattern

Rails supports a :through option on has_many and has_one associations. Although the ActiveRecord base will translate this :through into a join, in our Hydration world we can define it as accessing a relation by way of another, previously Hydrated relation:

// models/Session.php
class Session {
    public function relations() {
        return [
            ...
            // We cannot inverse these, because this might not be the only one of $unit's
            // or $parentClass's sessions
            'unit' => [self::BELONGS_TO, 'Unit', 'unit_id'],
            'parentClass' => [self::HAS_ONE, 'ParentClasses', 'parent_class_id', 'through' => 'unit'],
            ...
        ];
    }
}
// models/ParentClasses.php
class ParentClasses {
    public function relations() {
        return [
            ...
            'units' => [self::INVERSED_HAS_MANY, 'Unit', 'parent_class_id', 'inverseOf' => 'parentClass'],
            'sessions' => [self::HAS_MANY, 'Session', 'parent_class_id', 'through' => 'units']
        ];
    }
}
trait RestModel {
    public function __get($name) {
        if (isset($this->getMetadata()->relations[$name])
            && !empty($this->getMetadata()->relations[$name]->through)) {
            $throughRelation = $this->getMetadata()->relations[$name]->through;
            if (!is_array($this->$throughRelation)) {
                // e.g. $session->unit->parentClass, Session HAS ONE ParentClasses THROUGH Unit
                return $this->$throughRelation->$name;
            } else {
                // e.g. $parentClass->units->sessions, ParentClasses HAS MANY Session THROUGH Unit
                $collected = [];
                foreach ($this->$throughRelation as $related) {
                    $collected = array_merge($collected, $related->$name);
                }
                return $collected;
            }
        }
        return parent::__get($name);
    }
}

This overloads the standard getter: when a relation defines the through key, the getter resolves it by way of the already-hydrated intermediate object or set of objects. This way, second-order relations are forced to be pre-hydrated, with the advantage of removing redundant DB queries and maintaining consistent object identities. Every session reached via $parentClass->units->sessions will hold a parentClass reference identical to $parentClass.
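To make the identity claim concrete, the same getter logic can be exercised outside Yii. This is a simplified sketch under stated assumptions: `ThroughRecord`, `setRelation()`, `through()` and the `Demo*` classes are hypothetical names invented for the example, and relations are stored in a plain array rather than in CActiveRecord metadata.

```php
<?php
// Hypothetical base class standing in for the model layer; not Yii's API.
abstract class ThroughRecord
{
    protected $loaded = []; // hydrated relations, keyed by relation name

    /** Map of relation name => relation it is reached through. */
    abstract protected function through();

    public function setRelation($name, $value)
    {
        $this->loaded[$name] = $value;
    }

    public function __get($name)
    {
        $map = $this->through();
        if (isset($map[$name])) {
            $via = $this->__get($map[$name]);
            if (!is_array($via)) {
                return $via->$name; // to-one: delegate to the intermediate object
            }
            $collected = [];
            foreach ($via as $related) { // to-many: gather from each intermediate
                $collected = array_merge($collected, $related->$name);
            }
            return $collected;
        }
        if (!array_key_exists($name, $this->loaded)) {
            throw new RuntimeException("Relation '$name' was not hydrated");
        }
        return $this->loaded[$name];
    }
}

class DemoParentClass extends ThroughRecord
{
    protected function through() { return ['sessions' => 'units']; }
}
class DemoUnit extends ThroughRecord
{
    protected function through() { return []; }
}
class DemoSession extends ThroughRecord
{
    protected function through() { return ['parentClass' => 'unit']; }
}

// Hand-built equivalent of hydrating 'units.sessions' with inverses set.
$pc = new DemoParentClass();
$u1 = new DemoUnit();
$u2 = new DemoUnit();
$s1 = new DemoSession();
$s2 = new DemoSession();
$s3 = new DemoSession();
$pc->setRelation('units', [$u1, $u2]);
$u1->setRelation('parentClass', $pc);
$u2->setRelation('parentClass', $pc);
$u1->setRelation('sessions', [$s1, $s2]);
$u2->setRelation('sessions', [$s3]);
$s1->setRelation('unit', $u1);
$s2->setRelation('unit', $u1);
$s3->setRelation('unit', $u2);

$sessions = $pc->sessions;                              // collected through units
$sameParent = ($sessions[2]->parentClass === $pc);      // resolved through unit
```

Only `units`, `sessions` on each unit, and the first-order inverses were ever set, yet `$pc->sessions` and `$session->parentClass` both resolve, and `$sameParent` is true because no second copy of the parent class is created.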

Limitation of through

However, the fact still remains that there are two ways to hydrate the sessions on a parent class. We can either hydrate sessions directly as a first-order relation, or indirectly through units as a second-order relation. Even though our throughs are set correctly, the hydrator cannot easily tell that $parentClass->sessions are identical to $parentClass->units->sessions, and thus will require separate Hydration calls for them.

Proposed

Any second-order relation that could also be defined as a first-order relation
should only be hydrated and referred to in its second-order form

In other words, wherever we access $parentClass->sessions directly, we should instead resolve it as $parentClass->units->sessions. The shorthand $parentClass->sessions can remain, but it should use the second-order definition of the relation to preserve inverse identity. By simply hydrating units.sessions with appropriate, bi-directional, inversed THROUGH relations, all of these work:

$parentClass->units;
$parentClass->sessions;
$parentClass->units->sessions;
$parentClass->units->sessions->unit;
$parentClass->units->sessions->parentClass;
$parentClass->units->sessions->parentClass->units->parentClass->sessions;
...

Of course, we would never write these like this, but given our nested view structure, this is a potential chain of reference as we dive into and out of subviews.

Remaining Issues

Despite the addition of inverse and through, this "identity" problem still exists for non-inversible relations:

$session = Session::model()->findByPk($sessionId);
Session::hydrate($session, ['unit.parentClass.units.sessions']);
$parentClass = $session->parentClass; // goes 'through' unit
$unit = $session->unit;
$units = $parentClass->units; // $units does not contain $unit. An equivalent
                              // Unit is in the array, but not an identical one.
                              // Any further hydrations done on $units will not
                              // be reflected in $unit.
$sessions = $parentClass->sessions; // Same with $session. $sessions does not
                                    // contain an identical $session

Because $parentClass->units and ->sessions are HAS MANY relations, the $session->parentClass and $unit->parentClass relations cannot be inversed. The $session or $unit that we have may be one of several on the $parentClass, so we cannot simply set it as the ->units or ->sessions relation on $parentClass. We would need an additional DB lookup to get the other sessions, which results in new, non-identical copies of the session and unit being placed in the $parentClass->sessions and ->units arrays.

To solve this, the Hydrator would need to be smart enough to recognize, when hydrating the $parentClass->units relation, that one of those units is already in memory, and include that exact instance in the result of its hydration.
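One common way to get this behavior is an identity map: a per-request registry keyed by record type and primary key. The sketch below is only an assumption about how such a Hydrator might work; `Row`, `IdentityMap` and `resolve()` are hypothetical names, not part of Yii or of the existing Hydration code.

```php
<?php
// Hypothetical record: just enough to carry a type and a primary key.
class Row
{
    public $type;
    public $id;

    public function __construct($type, $id)
    {
        $this->type = $type;
        $this->id = $id;
    }
}

// Hypothetical per-request registry of already-loaded records.
class IdentityMap
{
    private $seen = [];

    /**
     * Return the instance already registered for this type and key,
     * or register $candidate and return it.
     */
    public function resolve(Row $candidate)
    {
        $key = $candidate->type . ':' . $candidate->id;
        if (!isset($this->seen[$key])) {
            $this->seen[$key] = $candidate;
        }
        return $this->seen[$key];
    }
}

$map = new IdentityMap();

// Loaded first, for example while hydrating $session->unit.
$unit = $map->resolve(new Row('Unit', 3));

// Later, hydrating $parentClass->units builds fresh rows from the query
// result; passing them through the map swaps in the known instance.
$fresh = [new Row('Unit', 2), new Row('Unit', 3)];
$units = array_map([$map, 'resolve'], $fresh);

$reused = ($units[1] === $unit);
$containsUnit = in_array($unit, $units, true); // strict, identity-based search
```

With the map in place `$reused` and `$containsUnit` are both true, so further hydrations done on `$units` would be visible through `$unit` as well. The trade-off is that the map has to live for the whole request and be consulted by every hydration path.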

@PatrickStankard

Sweet 👍

@pjdanfor

Ballin' Nice write-up, too. Clearly explains the problem and proposed solution.
