@mikekelly
Forked from practicingruby/part1.md
Created May 22, 2012 12:01

Inheritance is a key concept in most object-oriented languages, but applying it skillfully can be challenging in practice. Back in 1989, M. Sakkinen wrote a paper called Disciplined inheritance that addresses these problems and offers some useful criteria for working around them. Despite being more than two decades old, this paper is extremely relevant to the modern Ruby programmer.

Sakkinen's central point seems to be that most traditional uses of inheritance lead to poor encapsulation, bloated object contracts, and accidental namespace collisions. He provides two patterns for disciplined inheritance and suggests that by normalizing the way that we model things, we can apply these two patterns to a very wide range of scenarios. He goes on to show that code that conforms to these design rules can easily be modeled as ordinary object composition, exposing a solid alternative to traditional class-based inheritance.

These topics are exactly what this two-part article will cover, but before we can address them, we should establish what qualifies as inheritance in Ruby. The general term is somewhat overloaded, so a bit of definition up front will help start us off on the right foot.

Flavors of Ruby inheritance

Although classical inheritance is centered on the concept of class-based hierarchies, modern object-oriented programming languages provide many different mechanisms for code sharing. Ruby is no exception: it provides four common ways to model inheritance-based relationships between objects.

  • Classes provide a single-inheritance model similar to what is found in many other object-oriented languages, albeit lacking a few privacy features.

  • Modules provide a mechanism for modeling multiple inheritance, which is easier to reason about than C++ style class inheritance but is more powerful than Java's interfaces.

  • Transparent delegation techniques make it possible for a child object to dynamically forward messages to a parent object. This technique has similar effects as class-/module-based modeling on the child object's contract but preserves encapsulation between the objects.

  • Simple aggregation techniques make it possible to compose objects for the purpose of code sharing. This technique is most useful when the subobject is not meant to be a drop-in replacement for the superobject.

Although most problems can be modeled using any one of these techniques, they each have their own strengths and weaknesses. Throughout both parts of this article, I'll point out the trade-offs between them whenever it makes sense to do so.
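
Of the four flavors, transparent delegation is probably the least familiar, so here is a minimal sketch of it. It uses SimpleDelegator from Ruby's standard delegate library, which is only one of several ways to forward messages; the NumberList and ReportView classes are hypothetical names invented for this illustration.

```ruby
require "delegate"

# Hypothetical parent object that owns the data.
class NumberList
  def initialize(numbers)
    @numbers = numbers
  end

  def sum
    @numbers.reduce(0, :+)
  end

  def count
    @numbers.count
  end
end

# SimpleDelegator forwards any message that ReportView does not
# define itself to the wrapped NumberList.
class ReportView < SimpleDelegator
  def to_s
    "The sum is #{sum}"
  end
end

view = ReportView.new(NumberList.new([1, 2, 3]))

puts view.to_s   # ReportView#to_s, which calls the forwarded #sum
puts view.count  # forwarded to NumberList#count
```

The child's contract grows to include everything the parent responds to, just as it would with a subclass, but the parent's instance variables stay inside the parent object.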

Modeling incidental inheritance

Sakkinen describes incidental inheritance as the use of an inheritance-based modeling approach to share implementation details between dissimilar objects. That is to say that child (consumer) objects do not have an is-a relationship to their parents (dependencies) and therefore do not need to provide a superset of their parents' functionality.

In theory, incidental inheritance is easy to implement in a disciplined way because it does not impose complex constraints on the relationships between objects within a system. As long as the child object is capable of working without errors for the behaviors it is meant to provide, it does not need to take special care to adhere to the Liskov Substitution Principle. In fact, the child needs only to expose and interact with the bits of functionality from the parent object that are specifically relevant to its domain.

Regardless of the model of inheritance used, Sakkinen's paper suggests that child objects should rely only on functionality provided by immediate ancestors. This is essentially an inheritance-oriented parallel to the Law of Demeter and sounds like good advice to follow whenever it is practical to do so. However, this constraint would be challenging to enforce at the language level in Ruby and may not be feasible to adhere to in every imaginable scenario. In practice, the lack of adequate privacy controls in Ruby makes traditional class hierarchies or module mixins quite messy for incidental inheritance, which complicates things a bit. But before we discuss that problem any further, we should establish what incidental inheritance looks like from several different angles in Ruby.

In the following set of examples, I construct a simple Report object that computes the sum and average of numbers listed in a text file. I break this problem into three distinct parts: a component that provides functionality similar to Ruby's Enumerable module, a component that uses those features to do simple calculations on numerical data, and a component that outputs the final report. The contrived nature of this scenario should make it easier to examine the structural differences between Ruby's various ways of implementing inheritance relationships, but be sure to keep some more realistic scenarios in the back of your mind as you work through these examples.

The classical approach of using a class hierarchy for code sharing is worth looking at, even if most practicing Rubyists would quickly identify this as the wrong approach to this particular problem. It serves as a good baseline for identifying the problems introduced by inheritance and how to overcome them. As you read through the following code, think of its strengths and weaknesses, as well as any alternative ways to model this scenario that you can come up with.

class EnumerableCollection
  def count
    c = 0
    each { |e| c += 1 }
    c
  end

  # Samnang's implementation from Issue 2.4
  def reduce(arg=nil) 
    return reduce {|s, e| s.send(arg, e)} if arg.is_a?(Symbol)

    result = arg
    each { |e| result = result ? yield(result, e) : e }

    result
  end
end

class StatisticalCollection < EnumerableCollection
  def sum
    reduce(:+) 
  end

  def average
    sum / count.to_f
  end 
end

class StatisticalReport < StatisticalCollection
  def initialize(filename)
    self.input = filename
  end

  def to_s
    "The sum is #{sum}, and the average is #{average}"
  end

  private 

  attr_accessor :input

  def each
    File.foreach(input) { |e| yield(e.chomp.to_i) }
  end
end

puts StatisticalReport.new("numbers.txt")

Through its inheritance-based relationships, StatisticalReport is able to act as a simple presenter object while relying on other reusable components to crunch the numbers for it. The EnumerableCollection and StatisticalCollection objects do most of the heavy lifting while managing to remain useful for a wide range of different applications. The division of responsibilities between these components is reasonably well defined, and if you ignore the underlying mechanics of the style of inheritance being used here, this example is a good demonstration of effective code reuse.

Unfortunately, the devil is in the details. When viewed from a different angle, it's easy to see a wide range of problems that exist even in this very simple application of class-based inheritance:

  1. It is possible to create instances of EnumerableCollection and StatisticalCollection but not possible to do anything meaningful with them as they are currently written. Although it's not necessarily a bad idea to make use of abstract classes, valid uses of that pattern typically invert the relationship shown here, with the child object filling in a missing piece so that its parent can do a complex job.

  2. Although StatisticalReport relies on only two relatively generic methods from StatisticalCollection, and StatisticalCollection similarly relies on only two methods from EnumerableCollection, the use of class inheritance forces a rigid hierarchical relationship between the objects. Even if it's not especially awkward to say a StatisticalCollection is an EnumerableCollection, it's definitely weird to say that a StatisticalReport is also an EnumerableCollection. What makes matters worse is that this sort of modeling prevents StatisticalReport from inheriting from something more topically related to its domain, such as an HtmlReport or something similar. As my favorite OOP rant proclaims, class hierarchies do not exist simply to satisfy our inner Linnaeus.

  3. There is no encapsulation whatsoever between the components in this system. The purely functional nature of both EnumerableCollection and StatisticalCollection makes this less of a practical concern in this particular example, but it is a dangerous characteristic of all code that uses class-based inheritance in Ruby. Any instance variables created within a StatisticalReport object will be directly accessible in method calls all the way up its ancestor chain, and the same goes for any methods that StatisticalReport defines. Although a bit of discipline can help prevent this from becoming a problem in most simple uses of class inheritance, deep method resolution paths can make accidental collisions of method definitions or instance variable names a serious risk. Such a risk might be mitigated somewhat by the introduction of class-specific privacy controls, but they do not currently exist in Ruby.

  4. As a consequence of points 2 and 3, the StatisticalReport object ends up with a bloated contract that isn't representative of its domain model. It'd be awkward to call StatisticalReport#count or StatisticalReport#reduce, but if those inherited methods are not explicitly marked as private in the StatisticalReport definition, they will still be callable by clients of the StatisticalReport object. Once again, the stateless nature of this program makes the effects less damning in this particular example, but it doesn't take much effort to imagine the inconsistencies that could arise due to this problem. In addition to real risks of unintended side effects, this kind of modeling makes it harder to document the interface of the StatisticalReport in a natural way and diminishes the usefulness of Ruby's reflective capabilities.
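
Points 3 and 4 are easier to appreciate with state involved. The following sketch is not part of the report example: TallyingCollection and PagedReport are hypothetical classes, written only to show how a parent and child that happen to pick the same instance variable name end up sharing it, and how the parent's methods leak into the child's contract.

```ruby
# Hypothetical parent that keeps a running tally in @count.
class TallyingCollection
  def record(item)
    @count = (@count || 0) + 1
    item
  end

  def recorded
    @count || 0
  end
end

# Hypothetical child that, unaware of the parent's internals,
# uses @count for something unrelated.
class PagedReport < TallyingCollection
  def initialize
    @count = "3 pages"
  end
end

report = PagedReport.new

begin
  report.record(:row)   # the parent tries to add 1 to a String
rescue TypeError => e
  puts "collision: #{e.class}"
end

# The child never asked for these methods, but clients can call them.
p report.respond_to?(:record)
p PagedReport.public_instance_methods(false)
```

Neither class is wrong on its own terms; the failure only exists because both share a single object's namespace.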

At least some of these issues can be resolved through the use of Ruby's module-based mixin functionality. The following example shows how our class-based code can be trivially refactored to use modules instead. Once again, as you read through the code, think of its strengths and weaknesses as well as how you might approach the problem differently if it were up to you to design this system.

module SimplifiedEnumerable
  def count
    c = 0
    each { |e| c += 1 }
    c
  end

  # Samnang's implementation from Issue 2.4
  def reduce(arg=nil) 
    return reduce {|s, e| s.send(arg, e)} if arg.is_a?(Symbol)

    result = arg
    each { |e| result = result ? yield(result, e) : e }

    result
  end
end

module Statistics
  def sum
    reduce(:+) 
  end

  def average
    sum / count.to_f
  end 
end

class StatisticalReport
  include SimplifiedEnumerable
  include Statistics

  def initialize(filename)
    self.input = filename
  end

  def to_s
    "The sum is #{sum}, and the average is #{average}"
  end

  private 

  attr_accessor :input

  def each
    File.foreach(input) { |e| yield(e.chomp.to_i) }
  end
end

puts StatisticalReport.new("numbers.txt")

Using module mixins does not improve the encapsulation of the components in the system or solve the problem of StatisticalReport inheriting methods that aren't directly related to its problem domain, but it does alleviate some of the other problems that Ruby's class-based inheritance causes. In particular, components that are not useful on their own can no longer be instantiated, and the dependencies between the components in the system are looser.

Although the Statistics and SimplifiedEnumerable modules are still not capable of doing anything useful without being tied to some other object, the relationship between them is much looser. When the two are mixed into the StatisticalReport object, an implicit relationship between Statistics and SimplifiedEnumerable exists due to the calls to reduce and count from within the Statistics module, but this relationship is an implementation detail rather than a structural constraint. To see the difference yourself, think about how easy it would be to switch StatisticalReport to use Ruby's Enumerable module instead of the SimplifiedEnumerable module I provided and compare that to the class-based implementation of this scenario.

The bad news is that the way that modules solve some of the problems that we discovered about class hierarchies in Ruby ends up making some of the other problems even worse. Because modules tend to provide a whole lot of functionality based on a very thin contract with the object they get mixed into, they are one of the leading causes of child obesity. For example, swapping my SimplifiedEnumerable module for Ruby's Enumerable module would cause a net increase of 42 new methods that could be directly called on StatisticalReport. And now, rather than having a single path to follow in StatisticalReport to determine its ancestry chain, there are two. A nice feature of mixins is that they have fairly simple rules about how they get added to the method lookup path to avoid some of the complexities involved in class-based multiple inheritance, but you still need to memorize those rules and be aware of the combinatorial effects of module inclusion.
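
Both effects can be inspected with Ruby's reflection methods. The sketch below uses cut-down versions of the two modules and a hypothetical, array-backed MixinReport class so that it runs without a numbers.txt file; the exact number of methods Enumerable contributes depends on the Ruby version, so it is printed rather than assumed.

```ruby
module SimplifiedEnumerable
  def count
    c = 0
    each { |_| c += 1 }
    c
  end
end

module Statistics
  def average
    total = 0
    each { |e| total += e }
    total / count.to_f
  end
end

# Hypothetical host class, backed by an array for brevity.
class MixinReport
  include SimplifiedEnumerable
  include Statistics

  def initialize(numbers)
    @numbers = numbers
  end

  def each(&block)
    @numbers.each(&block)
  end
end

# The module included last sits closest to the class in the lookup path.
p MixinReport.ancestors.take(3)

# Public methods the mixins added on top of what the class defines itself.
mixed_in = MixinReport.public_instance_methods -
           Object.public_instance_methods -
           MixinReport.public_instance_methods(false)
p mixed_in.sort

# How many public methods Ruby's own Enumerable would contribute instead.
p Enumerable.public_instance_methods.size
```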

As it turns out, modules are a pragmatic compromise that is convenient to use but only slightly more well-behaved than traditional class inheritance. In simple situations, they work just fine, but for more complex systems they end up requiring an increasing amount of discipline to use effectively. Nonetheless, modules tend to be used ubiquitously in Ruby programs despite these problems. A naïve observer might assume that this is a sign that we don't have a better way of doing things in Ruby, but they would be mostly wrong.

All the problems discussed so far with inheritance can be solved via simple aggregation techniques. For strong evidence of that claim, take a look at the refactored code shown here. As in the previous examples, keep an eye out for the pros and cons of this modeling strategy, and think about what you might do differently.

class StatisticalCollection
  def initialize(data)
    self.data = data
  end

  def sum
    data.reduce(:+) 
  end

  def average
    sum / data.count.to_f
  end 

  private

  attr_accessor :data
end

class StatisticalReport
  def initialize(filename)
    self.input = filename
    
    self.stats = StatisticalCollection.new(each)
  end

  def to_s
    "The sum is #{stats.sum}, and the average is #{stats.average}"
  end

  private 

  attr_accessor :input, :stats

  def each
    return to_enum(__method__) unless block_given?

    File.foreach(input) { |e| yield(e.chomp.to_i) }
  end
end

puts StatisticalReport.new("numbers.txt")

The first thing you'll notice is that the code is much shorter, as if by magic, but really it's because I completely cheated here and got rid of my counterfeit Enumerable object so that I could expose a potentially good idiom for dealing with iteration in an aggregation-friendly way. Feel free to mentally replace the object passed to StatisticalCollection's constructor with something like the code shown here if you don't want me to get away with parlor tricks:

require "forwardable"

class EnumerableCollection
  extend Forwardable

  # Forwardable bypasses privacy, which is what we want here.
  delegate :each => :data

  def initialize(data)
    self.data = data
  end

  def count
    c = 0
    each { |e| c += 1 }
    c
  end

  # Samnang's implementation from Issue 2.4
  def reduce(arg=nil) 
    return reduce {|s, e| s.send(arg, e)} if arg.is_a?(Symbol)

    result = arg
    each { |e| result = result ? yield(result, e) : e }

    result
  end

  private

  attr_accessor :data
end
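
The iteration idiom used by StatisticalReport#each in the aggregation example can also be looked at in isolation. In this sketch, NumberSource is a hypothetical class that reads from an array of strings instead of a file; when each is called without a block, to_enum(__method__) wraps the same method in an Enumerator that can be handed to any collaborator expecting reduce and count.

```ruby
# Hypothetical stand-in for a file of numbers, one per line.
class NumberSource
  def initialize(lines)
    @lines = lines
  end

  def each
    # Without a block, return an Enumerator that will call this
    # same method again later, this time with a block.
    return to_enum(__method__) unless block_given?

    @lines.each { |line| yield(line.chomp.to_i) }
  end
end

numbers = NumberSource.new(["1\n", "2\n", "3\n"]).each

p numbers.class         # Enumerator
p numbers.reduce(:+)    # Enumerable methods come along for free
p numbers.count
```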

Regardless of what iteration strategy we end up using, the following points are worth noting about the way we've modeled our system this time around:

  1. There are three components in this system, all of which are useful and testable as standalone objects.

  2. The relationships between all three components are purely indirect, and the coupling between the objects is limited to the names and behavior of the methods called on them rather than their complete surfaces.

  3. There is strict encapsulation between the three components: each has its own namespace, and each can enforce its own privacy controls. It's possible of course to side-step these protections, but they are at least enabled by default. The issue of accidental naming collisions between methods or variables of objects is completely eliminated.

  4. As a result of points 2 and 3, the surface of each object is kept narrowly in line with its own domain. In fact, the public interface of StatisticalReport has been reduced to its constructor and the to_s method, making it about as thin as possible while still being useful.
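
The claim in point 4 can be checked with reflection. The following sketch repeats the aggregation design in a reduced form: Stats and ArrayReport are hypothetical names, and the report takes an array of numbers rather than a filename so that the example is self-contained.

```ruby
# Hypothetical, array-backed variant of the aggregation example.
class Stats
  def initialize(data)
    @data = data
  end

  def sum
    @data.reduce(:+)
  end

  def average
    sum / @data.count.to_f
  end
end

class ArrayReport
  def initialize(numbers)
    self.stats = Stats.new(numbers)
  end

  def to_s
    "The sum is #{stats.sum}, and the average is #{stats.average}"
  end

  private

  attr_accessor :stats
end

report = ArrayReport.new([1, 2, 3])

puts report                                    # uses ArrayReport#to_s
p ArrayReport.public_instance_methods(false)  # the methods this class adds
p report.respond_to?(:sum)                     # Stats#sum is not exposed
```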

There are certainly downsides to using aggregation; it is not a golden hammer by any means. But when it comes to incidental inheritance, it seems to be the right tool for the job more often than not. I'd love to hear counterarguments to this claim, though, so please do share them if you have something in mind that you don't think would gracefully fit this style of modeling.

Reflections

Although it may be a bit hard to see why disciplined inheritance matters in the trivial scenario we've been talking about throughout this article, it becomes increasingly clear as systems become more complex. Most scenarios that involve incidental inheritance are actually relatively horizontal problems in nature, but the use of class-based inheritance or module mixins forces a vertical method lookup path that can become very unwieldy, to say the least. When taken to the extremes, you end up with objects like ActiveRecord::Base, which has a path that is 43 levels deep, or Prawn::Document, which has a 26-level-deep path. In the case of Prawn, at least, this is just pure craziness that I am ashamed to have unleashed upon the world, even if it seemed like a good idea at the time.

In a language like Ruby that lacks both multiple inheritance and true class-specific privacy for variables and methods, using class-based hierarchies or module mixins for complex forms of incidental inheritance requires a tremendous amount of discipline. For that reason, the extra effort involved in refactoring towards an aggregation-based design pales in comparison to the maintenance headaches caused by following the traditional route. For example, in both Prawn and ActiveRecord, aggregation would make it possible to flatten that chain by an order of magnitude while simultaneously reducing the chance of namespace collisions, dependencies on lookup order, and accidental side effects due to state mutations. It seems like the cost of somewhat more verbose code would be well worth it in these scenarios.

In Issue 3.8, we will move on to discuss an essential form of inheritance that Sakkinen refers to as completely consistent inheritance. Exploring that topic will get us closer to the concept of mathematical subtypes, which are much more interesting at the theoretical level than incidental inheritance relationships are. But because Ruby's language features make even the simple relationships described in this issue somewhat challenging to manage in an elegant way, I am still looking forward to hearing your ideas and questions about the things I've covered so far.

A major concern I have about incidental inheritance is that I still don't have a clear sense of where to draw the line between the two extremes I've outlined in this article. I definitely want to look further into this area, so please leave a comment if you don't mind sharing your thoughts on this.

Originally published in Practicing Ruby as Issue 3.7

In Issue 3.7 I started to explore the criteria laid out by M. Sakkinen's Disciplined Inheritance, a language-agnostic paper published over two decades ago that is surprisingly relevant to the modern Ruby programmer. In this issue, we continue where Issue 3.7 left off, with the question of how to maintain complete compatibility between parent and child objects in inheritance-based domain models. Or put another way, this article explores how to safely reuse code within a system without it becoming a maintenance nightmare.

After taking a closer look at what Sakkinen exposed about this topic, I came to realize that the ideas he presented were strikingly similar to the Liskov Substitution Principle. In fact, the extremely dynamic nature of Ruby makes establishing a behavioral notion of sub-typing (Liskov/Wing 1993) a prerequisite for developing disciplined inheritance practices. As a result, this article refers to Liskov's work more so than Sakkinen's, even though both papers have extremely interesting things to say about this topic.

Defining a behavioral subtype

Both Sakkinen and Liskov describe the essence of the inheritance relationship as the ability for a child object to serve as a drop-in replacement wherever its parent object can be used. While I've greatly simplified the concept by stating it in such a general fashion, this is the thread which ties their independent works together.

Liskov goes a step farther than Sakkinen by defining two kinds of behavioral subtypes: children which extend the behavior specified by their parents, and children which constrain the behavior specified by their parents. These concepts are not mutually exclusive, but because each brings up its own set of challenges, it is convenient to split them out in this fashion.

Both Sakkinen and Liskov emphasize that the abstract concept of subtyping is much more about the observable behavior of objects than it is about what exactly is going on under the hood. This is a natural way of thinking for Rubyists, and is worth keeping in mind as you read through the rest of this article. In particular, when we talk about the type of an object, we are focusing on what that object does, not what it is.

While the concept of a behavioral subtype sounds like a direct analogue for what we commonly refer to as "duck typing" in Ruby, the former is about the full contract of an object, rather than how it acts under certain circumstances. I go into more detail about the differences between these concepts towards the end of this article, but before we can discuss them meaningfully, we need to take a look at Liskov's two types of behavioral subtyping and how they can be implemented.

Behavioral subtypes as extensions

Whether you realize it or not, odds are you already are familiar with using behavioral subtypes as extensions. Whenever we inherit from ActiveRecord::Base or mix Enumerable into one of our objects, we're making use of this concept. In essence, the purpose of an extension is to bolt new behavior on top of an existing type to form a new subtype.

To ensure that our child objects maintain the substitution principle, we need to make sure that any new behavior and modifications introduced by extensions follow a few simple rules. In particular, all new functionality must either be completely transparent to the parent object or defined in terms of the parent object's functionality. Changing the signature of a method provided by the parent object would be considered an incompatible change, as would directly modifying instance variables referenced by the parent object. These strict rules may seem a bit overkill, but they are the only way to guarantee that your extended subtypes will be drop-in replacements for their supertypes.
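
To make the signature rule concrete, here is a small sketch that contrasts a compatible extension with an incompatible one. None of these classes come from Prawn: PlainDocument, HeadedDocument, StrictDocument, and render_greeting are all hypothetical names used only for illustration.

```ruby
# Hypothetical parent type.
class PlainDocument
  attr_reader :lines

  def initialize
    @lines = []
  end

  def text(contents, options = {})
    @lines << [contents, options]
    self
  end
end

# Compatible extension: new behavior defined purely in terms of
# the parent's public methods.
class HeadedDocument < PlainDocument
  def h1(contents)
    text(contents, :size => 24)
  end
end

# Incompatible extension: the second argument to #text is now required,
# so existing callers of the parent's contract break.
class StrictDocument < PlainDocument
  def text(contents, size)
    super(contents, :size => size)
  end
end

# Client code written against the parent's contract only.
def render_greeting(document)
  document.text("Hello")
end

render_greeting(HeadedDocument.new)    # works like a PlainDocument

begin
  render_greeting(StrictDocument.new)
rescue ArgumentError => e
  puts "not a drop-in replacement: #{e.message}"
end
```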

In practice, obeying these rules is not as hard as it seems. For example, suppose we wanted to extend Prawn::Document so that it implements some helpers for article typesetting:

Prawn::Article.generate("test.pdf") do
  h1 "Criteria for Disciplined Inheritance"
 
  para %{
    This is an example of building a Prawn-based article
    generator through the use of a behavioral subtype as
    an extension. It's about as wonderful and self-referential
    as you might expect.
  }

  h2 "Benefits of behavioral subtyping"

  para %{
    The benefits of behavioral subtyping cannot be directly
    known without experiencing them for yourself.
  }

  para %{
    But if you REALLY get stuck, try asking Barbara Liskov.
  }
end

The simplest way to implement this sort of DSL would be to create a subclass of Prawn::Document, as shown in the following example:

module Prawn
  class Article < Document
    include Measurements

    def h1(contents)
      text(contents, :size => 24)
      move_down in2pt(0.3)
    end

    def h2(contents)
      move_down in2pt(0.1)
      text(contents, :size => 16)
      move_down in2pt(0.2)
    end

    def para(contents)
      text(contents.gsub(/\s+/, " "))
      move_down in2pt(0.1)
    end
  end
end

As far as Liskov is concerned, Prawn::Article is a perfectly legitimate extension because instances of it are drop-in substitutes for Prawn::Document objects. In fact, this sort of extension is trivial to prove to be a behavioral subtype because it is defined purely in terms of public methods that are provided by its parents (Prawn::Document and Prawn::Measurements). Because the functionality added is so straightforward, the use of subclassing here might just be the right tool for the job.

The downside of using subclassing is that even minor alterations to program requirements can cause encapsulation-related issues to become a real concern. For example, if we decide that we want to add a pair of instance variables that control the fonts used for headers and paragraphs, it would be hard to guarantee that these variables wouldn't clash with the data contained within Prawn::Document objects. While we can assume that calls to public methods provided by the parent object are safe, we cannot say the same when it comes to referencing instance variables, and so a delegation-based model starts to look more appealing.

Suppose we wanted to support the following API, but through the use of delegation rather than subclassing:

Prawn::Article.generate("test.pdf") do
  header_font    "Courier"
  paragraph_font "Helvetica"

  h1 "Criteria for Disciplined Inheritance"
 
  para %{
    This is an example of building a Prawn-based article
    generator through the use of a behavioral subtype as
    an extension. It's about as wonderful and self-referential
    as you might expect.
  }

  h2 "Benefits of behavioral subtyping"

  para %{
    The benefits of behavioral subtyping cannot be directly
    known without experiencing them for yourself.
  }

  para %{
    But if you REALLY get stuck, try asking Barbara Liskov.
  }
end

Using a method_missing hook and a bit of manual delegation for the Prawn::Article.generate class method, it is fairly easy to implement this DSL:

module Prawn
  class Article
    def self.generate(*args, &block)
      Prawn::Document.generate(*args) do |pdf|
        new(pdf).instance_eval(&block)
      end
    end

    def initialize(document)
      self.document = document      
      document.extend(Prawn::Measurements)

      # set defaults so that @paragraph_font and @header_font are never nil.
      paragraph_font "Times-Roman"
      header_font    "Times-Roman"
    end

    def h1(contents)
      font(header_font) do
        text(contents, :size => 24)
        move_down in2pt(0.3)
      end
    end

    def h2(contents)
      font(header_font) do
        move_down in2pt(0.1)
        text(contents, :size => 16)
        move_down in2pt(0.2)
      end
    end

    def para(contents)
      font(paragraph_font) do
        text(contents.gsub(/\s+/, " "))
        move_down in2pt(0.1)
      end
    end

    def paragraph_font(font=nil)
      return @paragraph_font = font if font

      @paragraph_font
    end

    def header_font(font=nil)
      return @header_font = font if font

      @header_font
    end

    def method_missing(id, *args, &block)
      document.send(id, *args, &block)
    end

    private

    attr_accessor :document
  end
end

Taking this approach involves writing more code and adds some complexity. However, that is a small price to pay for the peace of mind that comes along with cleanly separating the data contained within the Prawn::Article and Prawn::Document objects. This design also makes it harder for Prawn::Article to have name clashes with Prawn::Document's private methods, and forces any private method calls to Prawn::Document to be done explicitly. Because transparent delegation exposes the full contract of the parent object, it is still necessary for the child object to maintain full compatibility with those methods in the same manner that a class-inheritance based model would. Nonetheless, this pattern provides a safer way to implement subtypes because it avoids incidental clashes which could otherwise occur easily.
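
One detail worth considering with method_missing-based delegation is that respond_to? knows nothing about the forwarded methods unless respond_to_missing? is defined as well. The sketch below shows that pairing; FakeDocument is a hypothetical stand-in for Prawn::Document, and ArticleWrapper is a hypothetical, cut-down relative of Prawn::Article. Unlike the code above, it forwards with public_send, which is a deliberate substitution so that the wrapped object's private methods are not exposed by accident.

```ruby
# Hypothetical stand-in for the wrapped document.
class FakeDocument
  def text(contents)
    "text: #{contents}"
  end
end

class ArticleWrapper
  def initialize(document)
    @document = document
  end

  def h1(contents)
    text(contents.upcase)   # not defined here, so it is forwarded
  end

  def method_missing(name, *args, &block)
    if @document.respond_to?(name)
      @document.public_send(name, *args, &block)
    else
      super
    end
  end

  # Keeps respond_to? and method(:name) consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @document.respond_to?(name, include_private) || super
  end
end

article = ArticleWrapper.new(FakeDocument.new)

puts article.h1("Criteria")
p article.respond_to?(:text)
p article.respond_to?(:unknown_helper)
```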

While the examples we've looked at so far combined with your own experiences should give you a good sense of how to extend code via behavioral subtypes, there are some common pitfalls I have glossed over in order to keep things simple. I'll get back to those before the end of the article, but for now we will turn our attention to the other kind of subtypes Liskov describes in her paper. She refers to them as constrained subtypes, but I've decided to call them restriction subtypes as an easy-to-remember mirror image of the extension subtype concept.

Behavioral subtypes as restrictions

Just as subtypes can be used to extend the behavior of a supertype, they can also be used to restrict generic behaviors by providing more specific implementations of them. The example Liskov uses in her paper illustrates how a stack structure can be viewed as a restriction on the more general concept of a bag.

In its simplest form, a bag is essentially nothing more than a set which can contain duplicates. Items can be added and removed from a bag, and it is possible to determine whether the bag contains a given item. However, much like a set, order is not guaranteed. While somewhat of a contrived example, the following code implements a Bag object similar to the one described in Liskov's paper:

ContainerFullError  = Class.new(StandardError)
ContainerEmptyError = Class.new(StandardError)

class Bag
  def initialize(limit)
    self.items = []
    self.limit = limit
  end

  def push(obj)
    raise ContainerFullError unless items.length < limit

    # Shuffling on every change keeps callers from relying on any ordering.
    items.shuffle!
    items.push(obj)
  end

  def pop
    raise ContainerEmptyError if items.empty?

    items.shuffle!
    items.pop
  end

  def include?(obj)
    items.include?(obj)
  end

  private

  attr_accessor :items, :limit
end

The challenge in determining whether a Stack object can meaningfully be considered a subtype of this sort of structure is that we need to find a way to describe the functionality of a bag so that it is general enough to allow for interesting subtypes to exist, but specific enough to allow the Bag object to be used on its own in a predictable way. Because Ruby lacks the design-by-contract features that Liskov depends on in her paper, we need to describe this specification verbally rather than relying on our tools to enforce it for us. The list of rules below is roughly similar to what she describes more formally in her work:

  1. A bag has items and a size limit.

  2. A bag has a push operation which adds a new object to the bag's items.

     • If the current number of items is less than the limit, the new object is added to the bag's items.

     • Otherwise, a ContainerFullError is raised.

  3. A bag has a pop operation which removes an object from the bag's items and returns it as a result.

     • If the bag has no items, a ContainerEmptyError is raised.

     • Otherwise, one object is removed from the bag's items and returned.

  4. A bag has an include? operation which indicates whether the provided object is one of the bag's items.

     • If the bag's items contain the provided object, true is returned.

     • Otherwise, false is returned.

With these rules in mind, we can see that the following Stack object satisfies the definition of a bag while simultaneously introducing a predictable ordering to items.

ContainerFullError  = Class.new(StandardError)
ContainerEmptyError = Class.new(StandardError)

class Stack
  def initialize(limit)
    self.data  = []
    self.limit = limit
  end

  def push(obj)
    raise ContainerFullError unless data.length < limit

    data.push(obj)
  end

  def pop
    raise ContainerEmptyError if data.empty?

    # no shuffling here, so the most recently pushed item comes out first
    data.pop
  end

  def include?(obj)
    data.include?(obj)
  end

  private

  attr_accessor :data, :limit
end

With the above code in mind, we can specify the behavior of a stack in the following way:

  1. A stack is a bag

  2. A stack's pop operation follows a last-in, first-out (LIFO) order.

Because the ordering requirements of a stack don't conflict with the defining characteristics of a bag, a stack can be substituted for a bag without any issues. The key thing to keep in mind here is that restriction subtypes can add additional constraints on top of what was specified by their supertypes, but cannot loosen the constraints put upon them by their ancestors in any way. For example, based on the way we defined bag objects, we would not be able to return nil instead of raising ContainerEmptyError when pop is called on an empty stack, even if that seems like a fairly innocuous change.
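Since Ruby will not enforce the bag rules for us, one option is to write them down as an executable check and run it against every class that claims to be a bag. The sketch below is only an illustration under some assumptions: SketchBag, SketchStack, LenientStack, and bag_contract_violations are hypothetical names that do not appear in Liskov's paper or elsewhere in this article, and the check covers just the rules listed above.

```ruby
ContainerFullError  = Class.new(StandardError) unless defined?(ContainerFullError)
ContainerEmptyError = Class.new(StandardError) unless defined?(ContainerEmptyError)

# Hypothetical bag, equivalent in behavior to the Bag shown earlier.
class SketchBag
  def initialize(limit)
    @data  = []
    @limit = limit
  end

  def push(obj)
    raise ContainerFullError unless @data.length < @limit
    @data.shuffle!
    @data.push(obj)
  end

  def pop
    raise ContainerEmptyError if @data.empty?
    @data.shuffle!
    @data.pop
  end

  def include?(obj)
    @data.include?(obj)
  end
end

# Hypothetical stack: same rules as the bag, plus LIFO ordering.
class SketchStack
  def initialize(limit)
    @data  = []
    @limit = limit
  end

  def push(obj)
    raise ContainerFullError unless @data.length < @limit
    @data.push(obj)
  end

  def pop
    raise ContainerEmptyError if @data.empty?
    @data.pop
  end

  def include?(obj)
    @data.include?(obj)
  end
end

# Loosens the bag's constraints: returns nil instead of raising on empty pop.
class LenientStack < SketchStack
  def pop
    super
  rescue ContainerEmptyError
    nil
  end
end

# Returns a list of bag rules that the given class fails to honor.
def bag_contract_violations(klass)
  violations = []
  bag = klass.new(2)
  bag.push(:a)
  bag.push(:b)

  violations << "include? must be true for a pushed item" unless bag.include?(:a) == true
  violations << "include? must be false for an absent item" unless bag.include?(:z) == false

  begin
    bag.push(:c)
    violations << "push on a full bag must raise ContainerFullError"
  rescue ContainerFullError
    # expected
  end

  popped = [bag.pop, bag.pop]
  violations << "pop must return the items that were pushed" unless popped.sort == [:a, :b]

  begin
    bag.pop
    violations << "pop on an empty bag must raise ContainerEmptyError"
  rescue ContainerEmptyError
    # expected
  end

  violations
end
```

Calling bag_contract_violations(SketchBag) and bag_contract_violations(SketchStack) should both return an empty array, while LenientStack is reported as violating the rule for popping from an empty bag. Note that the check deliberately says nothing about ordering, because ordering is a stack-specific constraint layered on top of the bag rules.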

Once again, maintaining this sort of discipline may seem on the surface to be more trouble than it is worth. However, these kinds of assumptions are baked into useful patterns such as the template method pattern, and are also key to designing type hierarchies for all sorts of data structures. A good example of these concepts in action can be found in the way Ruby organizes its numeric types: Integer, Float, Rational, and Complex all descend from Numeric. Be sure to check out Ruby's documentation if you want to get a sense of how exactly these classes hang together.
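If you want to inspect the numeric hierarchy on your own Ruby installation, a small helper along these lines will do it. The superclass_chain method is a hypothetical name introduced only for this sketch; it relies on nothing more than Class#superclass.

```ruby
# Walk up from a class to BasicObject, collecting each superclass on the way.
def superclass_chain(klass)
  chain = []
  while klass
    chain << klass
    klass = klass.superclass
  end
  chain
end

[Integer, Float, Rational, Complex].each do |klass|
  puts superclass_chain(klass).join(" < ")
end
```

Each chain passes through Numeric, which is where the shared numeric behavior is defined. Older Ruby versions also had Fixnum and Bignum as subclasses of Integer, so the exact picture depends on the version you are running.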

Whether you are designing extension subtypes or restriction subtypes, it is unfortunately easier to get things wrong than it is to get them right, due to all the subtle issues that need to be considered. For that reason, we'll take a look now at a few examples of flawed behavioral subtypes, and how to go about fixing them.

Examples of flawed behavioral subtypes

To test your understanding of behavioral subtype compatibility while simultaneously exposing some common pitfalls, I've provided three flawed examples for you to study. As you read through them, try to figure out what the subtype compatibility problem is, and how you might go about solving it.

  1. Suppose we want to add an equality operator to the bag structure. A sample operator is provided below for the Bag object, which conforms to the following newly specified feature: "Two bags are considered equal if they have equivalent items and size limits". What problems will we encounter in implementing a bag-compatible equality operator for the Stack object?
class Bag
  # other code similar to before

  def ==(other)
    [data.sort, limit] == [other.data.sort, other.limit]
  end

  protected 
  
  # NOTE: Implementing == is one of the few legitimate uses of 
  # protected methods / attributes
  attr_accessor :data, :limit
end
  2. Suppose we have two mutable objects, a Rectangle and a Square, and we wish to implement Square as a restriction of Rectangle. Given the following implementation of a Rectangle object, what problems will be encountered in defining a Square object?
class Rectangle
  def area
    width * height
  end

  attr_accessor :width, :height
end
  3. Suppose we have a PersistentSet object which delegates all method calls to the Set object provided by Ruby's standard library, as shown in the following code. Why is this not a compatible subtype, even though it does not explicitly modify the behavior of any of Set's operations?
require "set"
require "pstore"

class PersistentSet 
  def initialize(filename)
    self.store = PStore.new(filename)

    store.transaction { store[:data] ||= Set.new }
  end

  def method_missing(name, *args, &block)
    store.transaction do 
      store[:data].send(name, *args, &block)
    end
  end

  private

  attr_accessor :store
end

To avoid spoiling the fun of finding and fixing the defects with these examples yourself, I've hidden my explanation of the problems and solutions on a pair of gists. Please spend some time on this exercise before reading the spoilers, as you'll learn a lot more that way!

A huge hint is that the first problem is based on an issue discussed in Liskov's paper, and the second and third problems are discussed in an article about LSP by Bob Martin. However, please note that their solutions are not exactly the most natural fit for Ruby, and so there is still room for some creativity here.

Behavioral subtyping vs. duck typing

Between this article and the topics discussed in Issue 3.7, this two-part series has given a fairly comprehensive view of disciplined inheritance practices for the Ruby programmer. However, as I hinted towards the beginning of this article, there is the somewhat looser concept of duck typing that deserves a mention if we really want to see the whole picture.

What duck typing and behavioral subtypes have in common is that both concepts rely on what an object can do rather than what exactly it is. Where they differ is that behavioral subtyping is concerned with the behavior of an entire object, while duck typing is about how a given object behaves within a certain context. Duck typing can be a good deal more flexible than behavioral subtyping in that sense, because typically it involves an object implementing a meaningful response to a single message rather than an entire suite of behaviors. You can find a ton of examples of duck typing in use in Ruby, but perhaps the easiest one to spot is the ubiquitous use of the to_s method.

By implementing a to_s method in our objects, we are able to indicate to Ruby that our object has a meaningful string representation, which can then be used in a wide range of contexts. Among other things, the to_s method is called by the Kernel#puts method on whatever object you pass to it, and called automatically on the result of any expression executed via string interpolation. (In Ruby 1.9, irb also falls back to to_s when an inspect method is not provided; from Ruby 2.0 onward, Object#inspect no longer calls to_s.) Implementing a meaningful to_s method is not exactly a form of behavioral subtyping, but is still a very useful form of code sharing. Issue 1.14 and Issue 1.15 cover duck typing in great detail, but this single example is enough to point out the merits of this technique and how much simpler it is than the topics discussed in this article.
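As a small illustration of the to_s convention, consider the sketch below. Temperature and describe are hypothetical names made up for this example; the point is only that describe never checks what kind of object it receives, and works with anything that responds to to_s in a meaningful way.

```ruby
# A hypothetical value object with a meaningful string representation.
class Temperature
  def initialize(degrees)
    @degrees = degrees
  end

  def to_s
    "#{@degrees} degrees C"
  end
end

# Relies only on string interpolation, which calls to_s on its argument.
def describe(thing)
  "Current reading: #{thing}"
end

puts describe(Temperature.new(21))
puts describe(42)
puts describe(:unknown)
```

Temperature shares no ancestor-specific contract with Integer or Symbol beyond responding to to_s, yet all three can be passed to describe. That single-message agreement is all duck typing asks for, which is why it is so much lighter than the full behavioral contracts discussed earlier.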

Reflections

A true challenge for any practicing Rubyist is to find a balance between the free-wheeling culture of Ruby development and the more rigorous approaches of our predecessors. Disciplined inheritance techniques will make our lives easier, and knowing what a behavioral subtype is and how to design one will be sure to come in handy on any moderately complex project. However, we should keep our eyes trained on how these issues relate to maintainability, understandability, and changeability rather than obsessing about how they can lead us to mathematically pure designs.

I think there is room for another article on the practical applications of these ideas, in which I might talk about applying some design-by-contract concepts to Ruby programs, or how to develop shared unit tests which make it easier to test for compatibility when implementing subtypes. But I don't plan to work on that immediately, so for now we can sort out those issues via comments on this article. If you have any suggestions for how to tie these ideas back to real problems, or questions on how to apply them to the things you've been working on, please share your thoughts.

Originally published in Practicing Ruby as Issue 3.8
