@shehi
Forked from guiwoda/AR_Cache_Repository.php
Created February 28, 2018 12:42
AR (Eloquent) vs DM (Doctrine) gist

Common pitfalls found in AR / Eloquent

Data-driven modeling

AR focuses on data. Moreover, Eloquent makes this data public. Objects designed around an Eloquent model assume public access to those pieces of data, so encapsulation is harder and cohesion is blurred. Tracking where properties are accessed or modified is harder, even with advanced IDEs, because of magic properties and weak type hinting.
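
To make the tracking problem concrete, here is a minimal sketch of an attribute bag exposed through magic `__get` / `__set`, which is the general mechanism behind magic properties. `MagicRecord` is a hypothetical stand-in written for illustration, not Eloquent's actual implementation.

```php
<?php

// Hypothetical stand-in for an ActiveRecord model: every column lives in
// an untyped array and is reachable through magic accessors.
class MagicRecord
{
    private array $attributes = [];

    public function __get(string $name)
    {
        return $this->attributes[$name] ?? null;
    }

    public function __set(string $name, $value): void
    {
        $this->attributes[$name] = $value;
    }
}

$post = new MagicRecord();
$post->title = 'Hello';

// A typo creates a new "column" instead of failing, and nothing in the
// class declares which properties exist or what type they hold.
$post->tilte = 42;

var_dump($post->title); // string(5) "Hello"
var_dump($post->tilte); // int(42)
var_dump($post->body);  // NULL
```

Searching the codebase for writes to `title` would have to find every `->title =` on any variable that might hold this object, since no declared property or setter exists to anchor the search.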

Validation and invariants

In the context of a Laravel project, data gets validated as arrays (most likely user input extracted from the Request) and, later on, added to an Eloquent model through generic methods.

According to its author, "Laravel has no opinion on where or how you do validation." By default, all your models will inherit generic methods to be constructed or updated:

Model::create(array $attributes);
$model->update(array $attributes);
$model->fill(array $attributes);
new Model(array $attributes);

This behavior moves the responsibility of validating data and protecting invariants outside of the model. Depending on the architecture of the project, this may be a service layer, a command or an HTTP layer (controller, form request). Having this responsibility outside of the object makes for weak objects, whose state can't be trusted to be valid.

Intercepting this behavior is very difficult, as it involves overriding multiple methods of the Model class, some of which the Model itself assumes to be safe to call (empty constructors, for example).
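
The contrast can be sketched without any framework. Both classes here are hypothetical: `ArrayBackedPost` imitates a model that is filled from an array, while `GuardedPost` enforces its own invariant in the constructor.

```php
<?php

// Imitates a model that accepts whatever array it is given.
class ArrayBackedPost
{
    public array $attributes = [];

    public function fill(array $attributes): self
    {
        $this->attributes = array_merge($this->attributes, $attributes);

        return $this;
    }
}

// Protects its own invariant: a post always has a non-empty title.
class GuardedPost
{
    private string $title;

    public function __construct(string $title)
    {
        if (trim($title) === '') {
            throw new InvalidArgumentException('A post needs a title.');
        }

        $this->title = $title;
    }

    public function title(): string
    {
        return $this->title;
    }
}

// Nothing stops an invalid state unless every caller validates first.
$weak = (new ArrayBackedPost())->fill(['title' => '']);

// The guarded object cannot exist in an invalid state.
try {
    new GuardedPost('');
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

With the first class, validity depends on every call site; with the second, any instance that exists is known to be valid.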

Performance optimizations

While it is entirely possible to do performance optimizations in the AR pattern, Eloquent lacks work in this area. It has no IdentityMap to prevent hitting the database repeatedly for the same record, does not hydrate join queries into related models (with possible column collisions if done wrong!), and dropped its small cache implementation in version 5.0.

The only methods that allow for query optimization are the with / load relation eager-loading methods.

Implementing any sort of cache means intercepting internal ORM calls, which, given the level of coupling between Model, Query\Builder and Relations, would need to be done at the ORM level. Overriding methods at the Model level cannot accomplish any of this.
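
For reference, an identity map is a per-request registry that returns the already-loaded object for a given id instead of querying again. The sketch here is a generic illustration, with a hypothetical `$loader` callable standing in for the database query; it is not how Doctrine implements its unit of work.

```php
<?php

// Minimal identity map: one object instance per (class, id) pair.
class IdentityMap
{
    /** @var array<string, object> */
    private array $objects = [];

    /** @var callable */
    private $loader;

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function find(string $class, int $id): object
    {
        $key = $class . ':' . $id;

        if (!isset($this->objects[$key])) {
            // Only the first lookup reaches the (simulated) database.
            $this->objects[$key] = ($this->loader)($class, $id);
        }

        return $this->objects[$key];
    }
}

$queries = 0;
$map = new IdentityMap(function (string $class, int $id) use (&$queries): object {
    $queries++; // stands in for a SELECT

    return (object) ['id' => $id, 'class' => $class];
});

$first  = $map->find('Post', 1);
$second = $map->find('Post', 1);

var_dump($first === $second); // bool(true)
var_dump($queries);           // int(1)
```

Because every consumer receives the same instance, a change made through one reference is visible through all others within the request.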

Generic API

Eloquent models inherit a large, generic API that it assumes all models will need. This can be a problem when working in large teams, because knowledge of how things should be done is passed through convention instead of enforced by code. For example, a Model::all() method call could be a self-destruct button on a rather large table. Preventing calls to an already existing public method is more difficult than preventing the addition of a new one.

While this is true for Doctrine as well (generic repositories also have self-destruct findAll() methods), Eloquent's use of inheritance makes this worse: it puts all of these methods closer to the consumer, which is a negative point in the case of unwanted API, and it statically couples to it, so you can't hide it behind an injected dependency.
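
One way to keep unwanted methods away from consumers, which only works when data access is an injected dependency, is to depend on a narrow interface. The names used here (`PostFinder`, `InMemoryPostFinder`, `PostController`) are hypothetical.

```php
<?php

// Consumers only see what this interface declares; there is no all().
interface PostFinder
{
    public function find(int $id): ?array;
}

// Hypothetical implementation; a real one would wrap a repository.
class InMemoryPostFinder implements PostFinder
{
    private array $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function find(int $id): ?array
    {
        return $this->rows[$id] ?? null;
    }
}

class PostController
{
    private PostFinder $posts;

    public function __construct(PostFinder $posts)
    {
        $this->posts = $posts;
    }

    public function show(int $id): string
    {
        $post = $this->posts->find($id);

        return $post === null ? 'Not found' : $post['title'];
    }
}

$controller = new PostController(new InMemoryPostFinder([1 => ['title' => 'Hello']]));

echo $controller->show(1), PHP_EOL; // Hello
echo $controller->show(2), PHP_EOL; // Not found
```

A statically inherited API offers no equivalent seam: every consumer that can name the model class can also call everything the base class exposes.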

Common pitfalls found in DM / Doctrine

Object - Relational impedance mismatch

DataMapper assumes that the object's state can be modeled in a relational manner, but most of the time we end up adapting our modeling decisions to this restriction. This topic is older than Doctrine itself, and while DMs have evolved through the years, it's still a very important constraint.

Wikipedia on O-R impedance

Complexity

While database access and usage is no simple task, the mapping layer adds an extra level of complexity to it, one that ActiveRecord explicitly avoids. Reconstitution of objects from the database is handled by the ORM, and that assumes an internal structure of the mapped objects, which also limits design. Hooking into those processes is possible, but doing so shows how much more complex it is than just overriding a method.
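
As an illustration of what reconstitution involves, the sketch here rebuilds an object from a row without calling its constructor. This is a simplified, hypothetical hydrator; Doctrine's own hydration is considerably more involved.

```php
<?php

class Post
{
    private int $id;
    private string $title;

    public function __construct(string $title)
    {
        if ($title === '') {
            throw new InvalidArgumentException('A post needs a title.');
        }

        $this->title = $title;
    }

    public function title(): string
    {
        return $this->title;
    }
}

// Hypothetical hydrator: assumes each column name matches a property name.
function hydrate(string $class, array $row): object
{
    // Skip the constructor: the row is trusted to hold a valid state.
    $object = (new ReflectionClass($class))->newInstanceWithoutConstructor();

    // Closure::call() runs the closure in the object's class scope, which
    // gives it write access to private properties.
    $assign = function (array $row): void {
        foreach ($row as $property => $value) {
            $this->$property = $value;
        }
    };
    $assign->call($object, $row);

    return $object;
}

$post = hydrate(Post::class, ['id' => 7, 'title' => 'Stored title']);

echo $post->title(), PHP_EOL; // Stored title
```

Even this toy version depends on the entity keeping its state in properties that line up with the mapping, which is the kind of structural assumption that constrains design.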

Anemic domain modeling

Anemic Domain Modeling is modeling objects with public setters and getters and no real behavior beyond transporting data around. This incurs the cost of domain modeling without the benefits of actually adding behavior related to the domain, as described by Fowler in his bliki.

While this is not intrinsic to the DM pattern, it does have something to do with it. In an empty AR model, behavior is always present: AR gives you database access for all your models. But if you design an Entity that does not know about the database and does not have any relevant behavior, then you are arguably worse off than with AR.
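
The difference can be shown with two hypothetical entities. `AnemicPost` only transports data, so the publishing rule has to live in whoever calls it; `Post` owns the rule.

```php
<?php

// Anemic: getters and setters only, no domain behavior.
class AnemicPost
{
    private ?DateTimeImmutable $publishedAt = null;

    public function getPublishedAt(): ?DateTimeImmutable
    {
        return $this->publishedAt;
    }

    public function setPublishedAt(?DateTimeImmutable $publishedAt): void
    {
        $this->publishedAt = $publishedAt;
    }
}

// Behavioral: the entity decides when publishing is allowed.
class Post
{
    private ?DateTimeImmutable $publishedAt = null;

    public function publish(DateTimeImmutable $now): void
    {
        if ($this->publishedAt !== null) {
            throw new LogicException('Post is already published.');
        }

        $this->publishedAt = $now;
    }

    public function isPublished(): bool
    {
        return $this->publishedAt !== null;
    }
}

$post = new Post();
$post->publish(new DateTimeImmutable('2018-02-28'));

// Publishing twice is rejected by the entity itself.
try {
    $post->publish(new DateTimeImmutable('2018-03-01'));
    $secondPublishRejected = false;
} catch (LogicException $e) {
    $secondPublishRejected = true;
}
```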

<?php

namespace App\ActiveRecord;

use App\ActiveRecord\Models\Post;
use App\ActiveRecord\Models\User;

class PostRepository
{
    private $cache;

    public function __construct(Cache $cache)
    {
        // Any set() / get() cache implementation.
        $this->cache = $cache;
    }

    public function find($id)
    {
        $result = $this->cache->get("posts:$id");

        if (! $result) {
            $result = Post::find($id);
            $this->cache->set("posts:$id", $result);
        }

        return $result;
    }

    public function findFromAuthor(User $author)
    {
        $results = $this->cache->get("posts:author:" . $author->id);

        if (! $results) {
            // Assumes a fromAuthor local scope (scopeFromAuthor) on the Post model.
            $results = Post::fromAuthor($author)->get();
            $this->cache->set("posts:author:" . $author->id, $results);
        }

        return $results;
    }
}
/**
 * Something as simple as this already has problems:
 *
 * 1. The cache in this repository has no effect on the User <-> Post relationship.
 *    $user->posts will still hit the database, so the architecture has to force the relation
 *    to be loaded through a repository, which breaks the ActiveRecord pattern.
 *
 * 2. Invalidating a single Post has to invalidate every cached query, otherwise stale data will
 *    still be served from the cached query results. This leads to poor cache hit rates and
 *    possible cache slams, which makes for a very fragile performance optimization.
 *
 * 3. Even if posts are cached by id, any other model will still hit the database through its relations,
 *    for example a Comment's post belongsTo relation. Caching is harder because it does not happen
 *    at the ORM level.
 */
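
The second problem can be reproduced with plain arrays. In this sketch the `$cache` and `$database` arrays are hypothetical stand-ins for a cache store and a posts table, and only the edited post's own key is invalidated.

```php
<?php

// Hypothetical in-memory cache and "database", for illustration only.
$cache = [];
$database = [
    1 => ['id' => 1, 'author_id' => 10, 'title' => 'Original title'],
];

$findFromAuthor = function (int $authorId) use (&$cache, &$database): array {
    $key = "posts:author:$authorId";

    if (!isset($cache[$key])) {
        $cache[$key] = array_values(array_filter(
            $database,
            fn (array $row): bool => $row['author_id'] === $authorId
        ));
    }

    return $cache[$key];
};

$findFromAuthor(10);                    // warms "posts:author:10"

$database[1]['title'] = 'Edited title'; // update a single post
unset($cache['posts:1']);               // invalidate only its own key

$stale = $findFromAuthor(10)[0]['title'];

echo $stale, PHP_EOL; // Original title
```

The author listing keeps serving the old title until its key is invalidated too, and the repository has no way of knowing which query keys contain a given post.
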
<?php

namespace App\ActiveRecord\Models;

use Illuminate\Database\Eloquent\Model;

class Post extends Model
{
    protected $guarded = [];

    public function user()
    {
        return $this->belongsTo(User::class);
    }

    public function comments()
    {
        return $this->hasMany(Comment::class);
    }
}
// Q: What does this do?
// A: It models a post record. It has public access to the post data (read and write) and to its relations.
// Q: How am I supposed to use it?
// A: All its API is inherited and used through magic public properties. Database access is modeled through public
// methods such as save(), update(), create(), static find() and by using the query builder. Magic methods called
// scopes can be added to model specific data access or add global restrictions.
// Q: What data does it have?
// A: Check the database table.
// Q: Are there any preconditions that I should care about?
// A: All invariants and preconditions are delegated to consumers, at least by default. To protect invariants
// in this class you have to add a magic set[Prop]Attribute mutator. Explicit setter methods can otherwise
// be bypassed through the magic properties, so you would also have to throw exceptions or override __set
// if you go that way.
<?php

namespace App\ActiveRecord;

use App\ActiveRecord\Models\User;

class PostConsumer
{
    public function publish(User $user, $title)
    {
        return Models\Post::create([
            'user_id' => $user->id,
            'title' => $title,
        ]);
    }

    public function find($id)
    {
        return Models\Post::find($id);
    }

    public function complexListing()
    {
        return Models\Post::where('a_database_field', 'a_value')
            ->where('db_field_2', 'another_value')
            ->orderBy('db_field_3')
            ->get();
    }
}
// Pros:
// Very easy to use.
// Cons:
// Data structure leaked out.
// Static access makes PostConsumer hard to unit test without database calls.
// Common:
// Both have flexible data access implementations.
// Both strategies are easily tested through integration tests with a real database.
<?php

namespace App\DataMapper\Entities;

use Doctrine\Common\Collections\ArrayCollection;

class Post
{
    private $id;
    private $title;
    private $author;
    private $comments;

    public function __construct(User $author, $title)
    {
        $this->title = $title;
        $this->author = $author;
        $this->comments = new ArrayCollection();
    }

    public function getTitle()
    {
        return $this->title;
    }

    public function getAuthor()
    {
        return $this->author;
    }

    public function addComment(Comment $comment)
    {
        $this->comments[] = $comment;
    }

    public function getComments()
    {
        return $this->comments->getValues();
    }
}
// Q: What does this do?
// A: It models a post. It has private access to its data and its relations, and it exposes some of it
// through public methods.
// Q: How am I supposed to use it?
// A: All its API is explicit in the object. It has no inherited behavior.
// Q: What data does it have?
// A: Its data is explicit in its private properties and some of it may be exposed through its public API.
// Q: Are there any preconditions that I should care about?
// A: Constructors and mutators are modeled on a per-case basis, so each mutation will be able to enforce its
// invariants. It has no defaults, as it has no inherited code.
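<?php

namespace App\DataMapper;

use App\DataMapper\Entities\User;
use Doctrine\ORM\EntityManagerInterface;

class PostConsumer
{
    private $entityManager;

    public function __construct(EntityManagerInterface $entityManager)
    {
        $this->entityManager = $entityManager;
    }

    public function publish(User $user, $title)
    {
        $post = new Entities\Post($user, $title);

        $this->entityManager->persist($post);
        $this->entityManager->flush();
    }

    public function find($id)
    {
        return $this->entityManager->find(Entities\Post::class, $id);
    }

    public function complexListing()
    {
        $repo = $this->entityManager->getRepository(Entities\Post::class);

        // findBy() takes the ordering as a [field => direction] array.
        return $repo->findBy([
            'aPostObjectField' => 'a_value',
            'anotherPostObjectField' => 'another_value',
        ], ['orderableField' => 'ASC']);
    }
}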
// Pros:
// Easy to unit test without database calls (dependency on EM and Repository can be mocked)
// No database structure leaked
// Cons:
// More complex.
// String references to private field names suggest leaks as well.
// Common:
// Both have flexible data access implementations.
// Both strategies are easily tested through integration tests with a real database.
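<?php

namespace App\DataMapper;

use App\DataMapper\Entities\User;
// Older doctrine/persistence releases expose this as Doctrine\Common\Persistence\ObjectRepository.
use Doctrine\Persistence\ObjectRepository;

class PostRepository
{
    private $posts;

    public function __construct(ObjectRepository $posts)
    {
        $this->posts = $posts;
    }

    public function find($id)
    {
        return $this->posts->find($id);
    }

    public function findFromAuthor(User $author)
    {
        return $this->posts->findBy([
            'author' => $author,
        ]);
    }
}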
/**
* I leave this here because Doctrine has, since 2.5, a second level cache implementation that would take
* care of both scenarios using the default Repository implementation.
*/