Skip to content

Instantly share code, notes, and snippets.

@cavalle
Last active December 11, 2015 21:09
Show Gist options
  • Save cavalle/4660239 to your computer and use it in GitHub Desktop.
Save cavalle/4660239 to your computer and use it in GitHub Desktop.
Cohesion and Big ActiveRecord Models

Cohesion and Big ActiveRecord Models

One of the problems of big ActiveRecord models is their low cohesion. In Rails we group behaviour around the entities of the domain we're modelling. If we use only ActiveRecord models for that, we'll probably end up with classes full of actions, in most cases, completely unrelated to each other (except for the fact that they act on the same domain entity).

To illustrate the issue let's imagine a big Post AR model of a blog application. In this app, users can add tags to their posts. Since this is the only class with this ability, I decide to put the related logic in the Post model itself. For this, presumably at the beginning of the file, I write a couple of has_many entries for tags and taggings. Then, fifty lines below, along with the rest of scopes, I add one more to find posts by tag name. A couple of hundred lines later, I put a virtual attribute tag_list which will be used to update the associations from a string of tag names separated by commas. Finally, at the bottom of my gigantic model, a private method will contain the nuts and bolts of the transformation of the tag_list.

I don't know about you, but I find this often in my code. All the logic related to the same concern is scattered across a 600-line file. Every time the tagging functionality changes, I have to jump back and forth until I find the code that I need to update. This is specially inconvenient having associations, validations and callbacks which purpose, out of context, isn't always easy to understand.

Putting together what changes together

The fundamental rule of thumb is to put things together that change together.

This is how Martin Fowler encourage us to write highly cohesive code. That's one of the qualities I like my code to have in order to keep it enjoyable to maintain.

In our example, this is something we can achieve by extracting the tagging logic to a concern:

module Taggable
  extend ActiveSupport::Concern

  included do
    has_many :taggings
    has_many :tags, through: :taggings

    scope :tagged_with, ->(tag_name) do
      joins(:tags).where(tags: { name: tag_name }) 
    end
  end

  def tag_list
    tags.pluck(:name).join(', ')
  end

  def tag_list=(tag_list)
    assign_tag_list tag_list
  end

  private

  def assign_tag_list(tag_list)
    tag_names = tag_list.split(',').map(&:strip).uniq

    self.tags = tag_names.map do |tag_name| 
      Tag.where(name: tag_name).first_or_initialize
    end
  end
end

Note that I chose a concern here because it seems a solution particularly appropriate, but there are other solutions that we could have used. The point is that the logic that we've extracted is all that is likely to change together and nothing else.

It's important to be aware that I'm adding a layer of indirection that I didn't have before. It's the price I decide to pay for the benefits I get in terms of cohesion.

Small objects can have problems too

Nevertheless, big AR models aren't the only objects with problems of cohesion, as we're about to see.

Let's say that I've recently read the excellent “7 Patterns to Refactor Fat ActiveRecord Models” in the Code Climate blog and I decide to use a Policy object to encapsulate the rule that stipulates what's considered an active user in my application:

class ActiveUserPolicy
  def initialize(user)
    @user = user
  end

  def active?
    @user.email_confirmed? &&
    @user.last_login_at < 14.days.ago
  end
end

(This snippet is extracted from the original post. Following snippets are mine)

With this object, everywhere I need to know whether a user is active or not, all I have to do is:

ActiveUserPolicy.new(user).active?

In my applications, I often find that wherever I have a policy like this one I also need to be able to fetch the list of objects that comply with it. For that, another of the patterns mentioned in the Code Climate article will come in handy, a Query object:

class ActiveUserQuery
  def initialize(relation = User.scoped)
    @relation = relation
  end

  def find_each(&block)
    @relation.
      where(email_confirmed: true).
      where('last_login < ?', 14.days.ago).
      find_each(&block)
  end
end

The merit of using this solution is that I have two objects with just one responsibility. However, as you might have noticed, not everything is good. There's a problem of low cohesion here as well.

The main reason to change of these two objects is the same: the business rule that defines what is an active user. If tomorrow, for example, I decide that users can be disabled and disabled users aren't considered active, I'll have to change both the policy and the query object. That means that I have logic changing always at the same time that lives in two different places.

My personal preference to improve this situation would probably be, again, to use a concern:

module User::Active
  extend ActiveSupport::Concern

  included do
    scope :active, -> do
      where(email_confirmed: true).
      where('last_login < ?', 14.days.ago)
    end
  end

  def active?
    user.email_confirmed? &&
    user.last_login_at < 14.days.ago
  end
end

This way there's only one place to go to change this single aspect of my domain. Alternatively, @brynary himself suggests doing this, which also improves cohesion very nicely and is consistent with the services approach it seems I was following in this example.

Closing words

It's easy to identify big ActiveRecord models in our applications. What it might be more difficult is to split them and reorganise our domain logic to make our codebase more pleasurable to maintain. There are several patterns to choose from and different considerations to bear in mind. One of the things that usually helps is trying to keep your objects and modules highly cohesive by putting together whatever is likely to change together.

@gaizka
Copy link

gaizka commented Jan 30, 2013

Great article!

I am really, really happy with the concerns approach. Everything is pretty encapsulated, you always know exactly which file your method is going to be, etc.

This is the beginning of one of our big classes :)

class CoolPromo
  include ContestCommon
  include ContestCommon::OneTimePayments
  include ContestCommon::ConditionsAndPicture
  include ContestCommon::SpecialProPublishRules
  include ContestCommon::HasParticipations
  include ContestCommon::GeneralValidations
  include ContestCommon::VideoValidations
  include ContestCommon::MultipleLikes
  include ContestCommon::LikesRecruited
  include ContestCommon::HasBuzz
  include ContestCommon::HasWhiteLabel
  include ContestCommon::Joinable
  include ContestCommon::JoinableMultipleTimes
  include ContestCommon::HasComments
  include ContestCommon::HasVisits
  include EtagGeneration::Promos
...

As we talked at the Alfredo's rb, what I have not been able to do (and I'm not sure if it's worth it) would be testing each behaviour as a separate entity. That would assure me that there are no hidden dependencies between behaviors.

Something like this:

# We need to define a new table with only the columns needed to be able to test this behaviour.
class MySlimModelNotTheRealOne
    include ContestCommon::HasParticipations
end
spec "ContestCommon::HasParticipations" do
    ...
    ...
end

Instead of testing the fat-tish model.

My feelings tell me that's the "purest" thing to do, but it takes too much effort, IMO...

@cavalle
Copy link
Author

cavalle commented Jan 30, 2013

I plan to talk about concerns more specifically in future writings. But it's true that testing Rails concerns in "isolation" might be hard (particularly when they contain calls to ActiveRecord macros). I suppose the "natural" way of testing those types of concerns is as part of one of the models that include them.

@diasjorge
Copy link

When splitting into modules your concerns need to be as you said highly cohesive but from my experience people tend to split modules that have duplicated words but not duplicated concepts in the name of keeping things DRY.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment