vsavkin/rich_domain_models2.md

## rich_domain_models2.md

      
Building Rich Domain Models in Rails.

Part 1. Decoupling Persistence.

Abstract

A domain model is an effective tool for software development. It can express really complex business logic, and it helps stakeholders verify and validate their shared understanding of the domain. Building rich domain models in Rails is hard, primarily because Active Record doesn't play well with the domain model approach.
One way to deal with this problem is to use an ORM implementing the data mapper pattern. Unfortunately, there is no production-ready ORM doing that for Ruby. DataMapper 2 is going to be the first one.
Another way is to use Active Record just as a persistence mechanism and build a rich domain model on top of it. That's what I'm going to talk about in this article.
Problems with Active Record

First, let's take a look at some problems caused by using a class extending Active Record for expressing a domain concept:


The class is aware of Active Record. Therefore, you need to load Active Record to run your tests.


An instance of the class is responsible for saving and updating itself. It makes mocking and stubbing harder.


Every instance exposes low-level methods such as 'update_attributes!'. They give you too much power to change the internal state of objects. Power corrupts. That's why you see 'update_attributes' used in so many places.


"Has many" associations allow bypassing an aggregate root. Too much power, and as we all know, it corrupts.


Every instance is responsible for validating itself. It's hard to test. On top of that, it makes validations much harder to compose.


Solution

Following Rich Hickey's motto of splitting things apart, the best solution I see is to split every Active Record class into three different classes:

Entity
Data Object
Repository

The core idea here is that every entity, when instantiated, is given a data object. The entity delegates access to its fields to the data object. The data object doesn't have to be an Active Record object. You can always provide a stub or an OpenStruct instead. Since the entity is a plain old Ruby object, it doesn't know how to save, validate, or update itself. It also doesn't know how to fetch itself from the database.
A repository is responsible for fetching data objects from the database and constructing entities. It is also responsible for creating and updating entities. To cope with its responsibilities, the repository has to know how to map data objects to entities. A registry of all data objects and their corresponding entities is created to do exactly that.
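
The delegation idea can be sketched without Rails at all. The following is only an illustration: Invoice, InvoiceData and apply_discount are hypothetical names that do not appear elsewhere in this article, and a Struct stands in for the data object.

```ruby
require 'forwardable'

# Hypothetical stand-in for a data object. Any object with matching
# accessors would do, including an Active Record instance.
InvoiceData = Struct.new(:id, :amount, keyword_init: true)

# A plain old Ruby object: it only knows how to talk to its data object.
class Invoice
  extend Forwardable

  # Delegate field access to the data object held in @data.
  def_delegators :@data, :id, :amount, :amount=

  def initialize(data)
    @data = data
  end

  # Domain behaviour lives on the entity, not on the data object.
  def apply_discount(percent)
    self.amount = amount - (amount * percent / 100)
  end
end

invoice = Invoice.new(InvoiceData.new(amount: 200))
invoice.apply_discount(10)
invoice.amount # 180
```

Note that the entity never references a database or Active Record, so it can be exercised in a test with nothing but the Ruby standard library loaded.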
Example

Let's take a look at a practical application of this approach. Order and Item are two entities that form an aggregate. This is the schema we can use to store them in the database:
create_table "orders", :force => true do |t|
  t.decimal  "amount", :null => false
  t.date     "deliver_at"
  t.datetime "created_at", :null => false
  t.datetime "updated_at", :null => false
end

create_table "items", :force => true do |t|
  t.string   "name", :null => false
  t.decimal  "amount", :null => false
  t.integer  "order_id", :null => false
  t.datetime "created_at", :null => false
  t.datetime "updated_at", :null => false
end

As you can see, we didn't have to adapt the schema for our approach.
All entities are plain old Ruby objects that include the Model module:
class Order
  include Model

  # Delegates id, id=, amount, amount=, deliver_at, deliver_at= to the data object
  fields :id, :amount, :deliver_at

  # ...
end

class Item
  include Model

  fields :id, :amount, :name
end

where the Model module is defined as:
module Model
  def self.included(base)
    base.extend ClassMethods
  end

  attr_accessor :_data

  def initialize _data = _new_instance
    if _data.kind_of?(Hash)
      @_data = _new_instance _data
    else
      @_data = _data
    end
  end

  protected

  #...

  def _new_instance hash = {}
    # Using the registry to get the corresponding data class
    Registry.data_class_for(self.class).new hash
  end

  module ClassMethods
    def fields *field_names
      field_names.each do |field_name|
        self.delegate field_name, to: :_data
        self.delegate "#{field_name}=", to: :_data
      end
    end
  end
end

As the Order and Item classes form an aggregate, we can get a reference to an item only through its order. Therefore, we need to implement only one repository:
module OrderRepository
  extend Repository

  # All ActiveRecord classes are defined in the repository.
  class OrderData < ActiveRecord::Base
    self.table_name = "orders"

    attr_accessible :amount, :deliver_at

    validates :amount, numericality: true
    has_many :items, class_name: 'OrderRepository::ItemData', foreign_key: 'order_id'
  end

  class ItemData < ActiveRecord::Base
    self.table_name = "items"

    attr_accessible :amount, :name

    validates :amount, numericality: true
    validates :name, presence: true
  end

  # Mappings between models and data objects are defined here.
  # "root:true" means that the OrderData class will be used
  # when working with this repository.
  set_model_class Order, for: OrderData, root: true
  set_model_class Item, for: ItemData

  def self.find_by_amount amount
    where(amount: amount)
  end
end

Where the Repository module is defined as:
module Repository
  def persist model
    data(model).save!
  end

  def find id
    model_class.new(data_class.find(id))
  end

  protected 

  def where attrs
    # We search the database using the root data class and wrap
    # the results into the instances of the model class.
    data_class.where(attrs).map do |data|
      model_class.new data
    end
  end

  def data model
    model._data
  end

  def set_model_class model_class, options
    raise "Data class is not provided" unless options[:for]

    Registry.associate(model_class, options[:for])

    if options[:root]
      singleton_class.send :define_method, :data_class do
        options[:for]
      end

      singleton_class.send :define_method, :model_class do
        model_class
      end
    end
  end
end
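
The Registry used by Model and Repository is not shown in this article. A minimal sketch, assuming it only has to support the three calls that appear in the code above (associate, data_class_for and model_class_for), might look like the following. The two hashes and the stand-in Order and OrderData classes are assumptions made for illustration.

```ruby
# A minimal sketch of a registry; the real one is not shown in the article.
module Registry
  @data_classes  = {} # model class => data class
  @model_classes = {} # data class  => model class

  class << self
    # Records the mapping in both directions.
    def associate(model_class, data_class)
      @data_classes[model_class]  = data_class
      @model_classes[data_class]  = model_class
    end

    # Raises KeyError if the model class was never registered.
    def data_class_for(model_class)
      @data_classes.fetch(model_class)
    end

    # Raises KeyError if the data class was never registered.
    def model_class_for(data_class)
      @model_classes.fetch(data_class)
    end
  end
end

# Stand-in classes, used only to exercise the registry.
class Order; end
OrderData = Struct.new(:id, :amount)

Registry.associate(Order, OrderData)
Registry.data_class_for(Order)      # OrderData
Registry.model_class_for(OrderData) # Order
```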

Now, let's see how we can use all these classes in an application.
test "using a data object directly (maybe used for reporting purposes)" do
  order = OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today
  order.items.create! amount: 6, name: 'Item 1'
  order.items.create! amount: 4, name: 'Item 2'

  assert_equal 2, order.reload.items.size
  assert_equal 6, order.items.first.amount
end

test "using a saved model" do
  order_data = OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today

  order = Order.new(order_data)
  order.amount = 15

  assert_equal 15, order.amount
end

test "creating a new model" do
  order = Order.new
  order.amount = 15

  assert_equal 15, order.amount
end

test "using hash to initialize a model" do
  order = Order.new amount: 15

  assert_equal 15, order.amount
end

test "using a repository to fetch models from the database" do
  OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today

  orders = OrderRepository.find_by_amount 10

  assert_equal 10, orders.first.amount
end

test "persisting models" do
  order = Order.new amount: 10

  OrderRepository.persist order

  assert order.id.present?
  assert_equal 10, order.amount
end

test "using a struct instead of a data object (can be used for testing)" do
  order = Order.new OpenStruct.new
  order.amount = 99
  assert_equal 99, order.amount
end

Associations

One important aspect of building rich domain models hasn't been covered yet. How are the associations between an aggregate root and its children managed? How do we access items?
The simplest approach would be to build an array of Item instances ourselves using the Active Record association.
class Order
  include Model

  fields :id, :amount, :deliver_at
 
  def items
    _data.items.map{|i| Item.new i}
  end 

  def add_item attrs
    Item.new(_data.items.new(attrs))
  end
end

The problem here is that everyone is forced to use the _data variable, which is really undesirable. We can provide a controlled accessor to the data object by adding the collection and wrap methods to the Model module.
module Model

  # Returns a Rails has_many association.
  def collection name
    _data.send(name)
  end

  # Wraps a collection of items into instances of the model class.
  def wrap collection
    return [] if collection.empty?
    model_class = Registry.model_class_for(collection.first.class)
    collection.map{|c| model_class.new c}
  end
end

Order using collection and wrap:
class Order
  include Model

  def items
    wrap(collection :items)
  end 

  def add_item attrs
    # wrap expects a collection, so the new data object goes into an array.
    wrap([collection(:items).new(attrs)]).first
  end
end

Though the changes may not seem significant at first, they are crucial. There is no need to access the _data variable anymore. On top of that, we don't have to create instances of Item ourselves.
But the collection and wrap methods are just the bare minimum. One can easily imagine the syntactic sugar we can add on top of them.
module Model
  module ClassMethods
    def collections *collection_names
      collection_names.each do |collection_name|
        # Defines a reader that returns the wrapped association.
        define_method collection_name do
          wrap(collection collection_name)
        end
      end
    end
  end
end


class Order
  include Model

  fields :id, :amount, :deliver_at
  collections :items

  def add_item attrs
    # wrap expects a collection, so the new data object goes into an array.
    wrap([collection(:items).new(attrs)]).first
  end
end

Now, let's see how we can use it in our application:
test "using a saved aggregate with children" do
  order_data = OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today
  order_data.items.create! amount: 6, name: 'Item 1'

  order = Order.new order_data

  assert_equal 6, order.items.first.amount
end


test "persisting an aggregate with children" do
  order = Order.new amount: 10
  order.add_item name: 'item1', amount: 5

  OrderRepository.persist order

  from_db = OrderRepository.find(order.id)

  assert_equal 5, from_db.items.first.amount
end

Validations

Since data objects are hidden, and aren't supposed to be accessed directly by the client code, we need to change the way we run validations. There are lots of available options, one of which is the following:
module DataValidator
  def self.validate model
    data = model._data
    data.valid?
    data.errors.full_messages
  end
end

That's how you'd use it in the code:
test "using data validation for a saved model" do
  order_data = OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today

  order = Order.new(order_data)
  assert_equal [], DataValidator.validate(order)
end

test "using data validation for a new model" do
  order = Order.new amount: 10
  assert_equal [], DataValidator.validate(order)
end

You don't have to return an array of strings. It can be a hash or even a special object. The idea here is to separate entities from their validations. Once again, by splitting things apart we end up with a better design. Why? For one thing, we can compose validations at run time based on, for instance, user settings. For another, we can validate a group of objects together, so there is no need to copy errors from one object to another.
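
As a rough illustration of both points, here is one way validators could be composed at run time and applied to several objects at once. This is a sketch under assumptions, not part of the approach described above: CompositeValidator, AmountIsPositive, DeliveryDateIsSet and OrderStub are hypothetical names, and a Struct stands in for the entity.

```ruby
# Each validator is any object responding to #call(model) and returning
# an array of error messages. All names here are hypothetical.
AmountIsPositive = lambda do |order|
  order.amount.to_i > 0 ? [] : ['Amount must be positive']
end

DeliveryDateIsSet = lambda do |order|
  order.deliver_at ? [] : ['Delivery date is required']
end

# Runs every validator against every model and collects the messages.
class CompositeValidator
  def initialize(*validators)
    @validators = validators
  end

  def validate(*models)
    models.flat_map do |model|
      @validators.flat_map { |validator| validator.call(model) }
    end
  end
end

# Stand-in for an entity.
OrderStub = Struct.new(:amount, :deliver_at)

# Choose the validators at run time, for example from a user setting.
strict = true
validators = [AmountIsPositive]
validators << DeliveryDateIsSet if strict

validator = CompositeValidator.new(*validators)
errors = validator.validate(OrderStub.new(10, nil), OrderStub.new(0, nil))
# errors now holds the messages for both orders in a single array.
```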
Architecture

Separating persistence from domain model has a tremendous impact on the architecture of our applications. The following is the traditional Rails app architecture.

That’s what we get if we separate persistence.

You don't have to be one of the Three Amigos to see the flaws of the traditional Rails app architecture: the domain classes depend on the database and Rails. The architecture illustrated by the second diagram doesn’t have these flaws, which allows us to keep the domain logic abstract and framework-agnostic.
What we got


The persistence logic has been extracted into OrderRepository. Having a separate object is beneficial in many ways. For instance, it simplifies testing, as it can be mocked up or faked.


Instances of Order and Item are no longer responsible for saving or updating themselves. The only way to change them is to use domain-specific methods.


Low-level methods (such as update_attributes!) are no longer exposed.


There is no ItemRepository, and the entities expose no has_many associations. The result is an enforced aggregate boundary.


Having validations separated enables better composability and simplifies testing.
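
To illustrate the first point, a repository can be replaced in tests by an in-memory fake that offers the same persist, find and find_by_amount methods. The class below is a hypothetical sketch: InMemoryOrderRepository and FakeOrder are not part of the code shown earlier, and the id assignment only mimics what a database would do.

```ruby
# A hypothetical in-memory stand-in for OrderRepository, for tests only.
class InMemoryOrderRepository
  def initialize
    @orders  = {}
    @next_id = 1
  end

  # Assigns an id on first save, mimicking what the database would do.
  def persist(order)
    if order.id.nil?
      order.id = @next_id
      @next_id += 1
    end
    @orders[order.id] = order
  end

  # Raises KeyError when no order with this id has been persisted.
  def find(id)
    @orders.fetch(id)
  end

  def find_by_amount(amount)
    @orders.values.select { |order| order.amount == amount }
  end
end

# Stand-in for the Order entity.
FakeOrder = Struct.new(:id, :amount)

repository = InMemoryOrderRepository.new
order = FakeOrder.new(nil, 10)
repository.persist(order)

repository.find(order.id).amount   # 10
repository.find_by_amount(10).size # 1
```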


Wrapping Up

The suggested approach is fairly simple, but it provides real value when it comes to expressing complex domains. It also plays really well with legacy applications. Nothing has to be rewritten or redesigned from scratch. Just start using your existing Active Record models as data classes when building new functionality.