bf4/gist:67754b65ce30c37fad6a

## gistfile1.md

      
OO solutions for imperative problems

[September 22nd, 2013]
On September 17th, 2013, Aaron Cruz wrote in the Objects on Rails mailing list:

I have a very imperative problem. Something needs to be done, then something else, then something else, constantly changing state from one step to the next.
Video Import example

Look in directory for files in a hierarchy of folders
Organize by folder names
Group videos by suffix or prefix in filename
Persist videos
Send to workers to transcode


This sort of problem arises frequently. Most solutions I see are created in a very procedural way, albeit with a charming aura of object-orientation generated out of thin air by beautiful phrases and fuzzy arguments.
It's easier to think about things as if they were a list of tasks written on paper, one after the other. In the real world, though, things don't happen that way. You don't have a pipeline, but possibilities distributed like the branches of a tree, with multiple ways of proceeding at each node. For that reason, pipelines are a bad way of implementing things most of the time.
Let's define some principles which we're not going to violate (which are basically rules, but no one likes rules, right?):

Principle #1: the classical object is not always reflected by an object in Ruby. Before OO, a Computer had sub-parts which were simply procedural mechanisms. After OO, each sub-part is considered a computer in itself. So you have a Computer with sub-Computers. In that sense, a Computer, as an object, could be represented as the traditional namespace in Ruby. Consider the following from The Early History of Smalltalk, by Alan Kay:


The basic principal of recursive design is to make the parts have the same power as the whole. For the first time I thought of the whole as the entire computer and wondered why anyone would want to divide it up into weaker things called data structures and procedures. Why not divide it up into little computers, as time sharing was starting to? But not in dozens. Why not thousands of them, each simulating a useful structure?


Principle #2: an object should not know about the internals of other objects
Principle #3: an object should collaborate with other objects using messages


OOP to me means only messaging, local retention and protection, hiding of state - Alan Kay
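Principles #2 and #3 can be sketched in a few lines. The Account class below is a hypothetical example that is not part of the mailing list thread; it only shows an object keeping its state private and answering messages instead of exposing internals.

```ruby
# Hypothetical example: Account is made up for illustration.
class Account
  def initialize(balance)
    @balance = balance
  end

  # The message callers send. How the balance is stored stays hidden.
  def withdraw(amount)
    raise ArgumentError, "insufficient funds" unless can_afford?(amount)
    @balance -= amount
    self
  end

  def can_afford?(amount)
    amount <= balance
  end

  private

  # Private reader: other objects cannot ask for the raw state.
  attr_reader :balance
end

account = Account.new(100)
account.withdraw(30)
account.can_afford?(70) # => true
account.can_afford?(71) # => false
```

A collaborator never reads the balance and decides for the account. It asks the account whether it can afford something, or tells it to withdraw.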

Let's take the following example, a proposed OO solution given in that same mailing list:
files = FileFinder.new.call(folders)
files = FolderOrganizer.new(by: :name).call(files)
grouped = VideoGrouper.new(by: [:name, :prefix]).call(files)
Let's name a few problems we see with this "pipeliney" approach.
Problem #1: The class which encapsulates this code knows about things it shouldn't care about. Let's consider that it's all inside a class called VideoImporter (I know, the Er suffix, but let's stay with the author's idea for now).
VideoImporter knows a) what folders the videos are in (see the folders variable), b) that files should be found, c) that they should be organized, and d) that videos should be grouped. The implementation example isn't complete, but it would probably know even more.
Problem #2: Objects are not collaborating because, well, they're not sending meaningful messages to each other. What does it mean to send .call to an object? It's not telling it to do anything in particular. You can't easily understand what's going on.
If you needed a cup of coffee, would you go to Starbucks and say something like "Call." to the attendant? No! So why use these meaningless method calls? In order to understand What is happening, you have to open that class (e.g. FolderOrganizer) and figure out How things are done. That's so much cognitive load that I can't even reason about it.
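To make the contrast concrete, here is a minimal sketch of a message that names its intent. The Folder class and its method name are assumptions made up for this example, and the grouping rule (everything before the first underscore is the prefix) is also an assumption.

```ruby
# Hypothetical sketch: Folder and videos_grouped_by_prefix are
# illustrative names, not part of the original proposal.
class Folder
  def initialize(filenames)
    @filenames = filenames
  end

  # Compare with VideoGrouper.new(by: :prefix).call(files):
  # here the message itself says what the caller gets back.
  def videos_grouped_by_prefix
    @filenames.group_by { |filename| filename.split("_").first }
  end
end

folder = Folder.new(["beach_01.avi", "beach_02.avi", "city_01.avi"])
groups = folder.videos_grouped_by_prefix
groups.keys # => ["beach", "city"]
```

The reader of the calling code learns What happens from the message name alone and only opens Folder when the How matters.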
Problem #3: as soon as you add another source of videos, say, a REST API, or anything other than local files, you'll have to start adding if statements all over the place.
Write it down on a piece of paper and read it every morning: things in the real world are not going to happen one after the other 100% of the time. Instead, what best represents reality is the picture of a tree where, at each node, a new set of possibilities arises. Your pipeline will have to address new possibilities, be it because of physical restrictions or requirements from the client.
Conclusion: this is not OO.

The last thing you wanted any programmer to do is mess with internal state even if presented figuratively. Instead, the objects should be presented as sites of higher level behaviors more appropriate for use as dynamic components. [...]It is unfortunate that much of what is called “object-oriented programming” today is simply old style programming with fancier constructs. Many programs are loaded with “assignment-style” operations now done by more expensive attached procedures.

An OO solution

In order to present a good solution, let's draw some additional principles now:

Principle #4: don't name classes ending in Er. A class should represent a concept. If you do use Er, you're probably missing a concept and breaking encapsulation.

Quoting Peter Coad:

Challenge any class name that ends in "-er" (e.g. Manager or Controller). If it has no parts, change the name of the class to what each object is managing. If it has parts, put as much work in the parts that the parts know enough to do themselves.

I see this very often. VideoImporter, TableExporter, SubscriptionModifier, ConnectionManager, FileFinder, PluginManager, CsvLoader, APIRequestor, UserCreator, CreditCardCharger etc.
In all these situations, the opportunity for defining proper communication via meaningful messages is thrown away. VideoImporter should become video.import, TableExporter should become table.export, CreditCardCharger should become credit_card.charge, and so on. You'll notice that each Er word can be transformed into a meaningful message.
Regarding encapsulation, if you have Connection and ConnectionManager, the second probably knows about the internal state of Connection, which goes against OO in every way.
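As a rough sketch of that last point, the behavior a ConnectionManager would hold can live in the Connection itself. This Connection class is a hypothetical stand-in, not the API of any real library.

```ruby
# Hypothetical sketch: Connection is an illustrative class.
class Connection
  def initialize
    @open = false
  end

  # Instead of a manager flipping the connection's state from outside,
  # the connection receives the message itself.
  def open
    @open = true
    self
  end

  def close
    @open = false
    self
  end

  def open?
    @open
  end
end

connection = Connection.new
connection.open
connection.open? # => true
connection.close
connection.open? # => false
```

No other object needs to know that the state is a boolean. If it later becomes a socket handle, only Connection changes.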
A real world example

Let's start. What will we be importing? A Video. Right, so, what's a good class name?
class Video
end
Tough, right?
The requirements for the "pipeline" above are: 1) Look in directory for files in a hierarchy of folders, 2) organize by folder names, 3) group videos by suffix or prefix in filename, 4) persist videos and 5) send to workers to transcode.
Let's abstract this, disregarding details. Let's forget files, dirs, names etc for a moment. To import a video, we need to do only two things:

have a list of pending videos (in this case, it's in the filesystem, but could be a REST API, for example)
save them

class Video
  def initialize(io_object)
    @io_object = io_object # IO-compliant object
  end
  
  def import
    Persistence::Video.new(self).save
    # where does the Transcoding part go? I'll show in a minute.
  end
end

# we could call this like this
video = Video.new(some_file_instance)
video.import
Let's consider that we're getting these files via FTP. We want to abstract the files with proper objects and put them in order before they are imported. We want to be able to write something like this:
Ftp::File.find(".avi").map(&:import)
Let's be honest, by reading this you know exactly what's going on. You don't need to know how videos are imported, you only care that they're being imported. The implementation details would be something like this.
module Ftp
  class Video
    def initialize(file)
      @file = file
    end
    
    def import
      ::Video.new(self).import
    end
    
    private
    
    attr_reader :file
  end
  
  class File
    def initialize(filepath)
      @filepath = filepath
    end
    
    def import
      content_type_instance.import
    end
    
    def self.find(extension)
      # untested, but it should get the files matching the extension
      # anywhere under the current directory, sort them by filepath
      # and instantiate each one with File
      Dir.glob("**/*#{extension}").sort.map { |filepath| new(filepath) }
    end
    
    private
    
    attr_reader :filepath
    
    def content_type_instance
      # we'd have some logic for different content types here, but
      # given we don't have requirements for them, let's just have
      # Video. Consider how easy it is to use other content
      # types here. I'd use Polymorphism in the real world, something
      # like `extension.camelize.constantize.new(self)`, which would
      # allow me to extend new types without having to change this
      # class, but let's keep this example simple.
      Video.new(self)
    end
  end
end
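The comment in content_type_instance mentions choosing the class from the file extension. A minimal plain-Ruby sketch of that idea follows, using Module#const_get in place of ActiveSupport's camelize and constantize. The ContentTypes module and its Avi and Srt classes are hypothetical names invented for this example.

```ruby
# Hypothetical sketch: ContentTypes, Avi and Srt are illustrative.
module ContentTypes
  class Avi
    def initialize(filepath)
      @filepath = filepath
    end

    def import
      "importing video #{@filepath}"
    end
  end

  class Srt
    def initialize(filepath)
      @filepath = filepath
    end

    def import
      "importing subtitles #{@filepath}"
    end
  end

  # Maps "movie.avi" to ContentTypes::Avi without any conditional.
  # The `false` argument restricts the lookup to this module, so an
  # unknown extension raises NameError instead of finding a
  # top-level constant.
  def self.for(filepath)
    extension = File.extname(filepath).delete(".")
    const_get(extension.capitalize, false).new(filepath)
  end
end

ContentTypes.for("movie.avi").import # => "importing video movie.avi"
ContentTypes.for("movie.srt").import # => "importing subtitles movie.srt"
```

Supporting a new content type then means adding a class to the module, with no change to the lookup.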
This is nice. Now, let's say that we're in a Rake task or a Rails controller. We don't want to know whether the pending files are on an FTP server or sent via HTTP, SSH or whatever. We just don't care, we only want pending videos. We want this:
RemoteResources::Videos.new.import_all
Why RemoteResources? Because in our domain, anything that's not saved is remote, outside our application. We encapsulate this concept in a module, as if it was a computer on its own.
module RemoteResources
  class Videos
    def import_all
      Ftp::File.find(whitelisted_extensions).map(&:import)
      # S3::File.find(:videos).map(&:import)
    end
    
    private
    
    def whitelisted_extensions
      ".avi"
    end
  end
end
See? We can use ::Video from anywhere, from FTP to REST APIs to S3 to anything. A pipeline doesn't make it easy to stay DRY. The moment you have to add a source of videos other than FTP, you have to dismantle the pipeline or create a new one.
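Here is a runnable sketch of that reuse under simplifying assumptions: the sources return hard-coded names instead of talking to FTP or S3, Video records imports in memory instead of a database, and the sources are passed to RemoteResources::Videos through its constructor, which is a variation on the code above. Ftp::Files and S3::Bucket are hypothetical names.

```ruby
# Hypothetical sketch: in-memory stand-ins so the example runs
# without FTP, S3 or a database.
class Video
  def self.imported
    @imported ||= []
  end

  def initialize(name)
    @name = name
  end

  def import
    self.class.imported << @name
    self
  end
end

module Ftp
  class Files
    def videos
      ["ftp_a.avi", "ftp_b.avi"].map { |name| ::Video.new(name) }
    end
  end
end

module S3
  class Bucket
    def videos
      ["s3_a.avi"].map { |name| ::Video.new(name) }
    end
  end
end

module RemoteResources
  class Videos
    def initialize(sources)
      @sources = sources
    end

    # Every source answers `videos`; every video answers `import`.
    def import_all
      @sources.flat_map(&:videos).each(&:import)
    end
  end
end

RemoteResources::Videos.new([Ftp::Files.new, S3::Bucket.new]).import_all
Video.imported # => ["ftp_a.avi", "ftp_b.avi", "s3_a.avi"]
```

Adding a third source means writing one more class that answers videos. Neither Video nor import_all changes.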

Principle #5: when naming classes, consider the concept you want to represent, and then think about different contexts.

We have one concept: Video. However, a video is a different thing in different contexts. A video in the context of FTP is a mere file, but we humans look at it as a video, not a file. OO is about abstracting things in such a way that it's easier for us to think about them.
And so, we end up with 4 different contexts:

Video: the video in our application.
Persistence::Video: a video in the context of the database. It could be a table or whatever, we just look at it as a persisted video.
Ftp::Video: a video in the context of the filesystem under an FTP server.
RemoteResources::Videos: all remote videos. We consider that anything that's not on our domain is remote.

What about Transcoding? Here's a surprising thing. Can you think of a video ever being saved without being transcoded/converted/encoded? No, no video will ever be saved without being transcoded.
module Persistence
  class Video < ActiveRecord::Base
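    def save
      saved = super
      # Sidekiq workers are enqueued with perform_async; only do it
      # when the record was actually saved
      TranscodeVideoWorker.perform_async(id) if saved
      saved
    end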
  end
end
In other words, transcoding a video is just part of the persistence process.
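The same idea can be tried without Rails or Sidekiq. In the sketch below, TranscodingQueue and this Persistence::Video are plain-Ruby stand-ins invented for the example, and the valid flag is an assumption that simulates a failed save.

```ruby
# Hypothetical sketch: plain-Ruby stand-ins for Sidekiq and ActiveRecord.
class TranscodingQueue
  def self.jobs
    @jobs ||= []
  end

  def self.enqueue(video_id)
    jobs << video_id
  end
end

module Persistence
  class Video
    attr_reader :id

    def initialize(id, valid: true)
      @id = id
      @valid = valid
    end

    # For callers, saving and scheduling the transcoding are a
    # single operation behind one message.
    def save
      return false unless @valid
      TranscodingQueue.enqueue(id)
      true
    end
  end
end

Persistence::Video.new(1).save               # => true
Persistence::Video.new(2, valid: false).save # => false
TranscodingQueue.jobs                        # => [1]
```

Callers only ever send save. They cannot forget the transcoding step, because it is not theirs to remember.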
In the end, you have something like:
class RemoteVideosImportation < ActionController::Base
  def create
    RemoteResources::Videos.new.import_all
    # render
  end
end
Extending features is easy, and you won't have to touch other parts of the app. Besides maintainability, you also get readability. You won't have to figure out how things are done to be able to know what is going on. Naming things correctly just decreased the cognitive load needed to analyze the source code.