= Monkey Patching & Gorilla Engineering: Protocols In Clojure

We're coming up on the release of another major milestone in Clojure, version 1.2. In general, Clojure is getting better, faster, more robust, and more reliable. But the language is still new enough that a few major features get added every iteration, and 1.2 is no exception.

The most exciting features about to hit the mainstream in 1.2 are defprotocol and deftype, part of Clojure's new single-dispatch calling convention. Protocols and Types in Clojure try to solve the age-old problem: how do I associate behavior and data in a flexible way without totally shooting myself in the foot? They offer an interesting way to structure the relationship between behavior and data; they keep a quintessentially verb-centric outlook (similar to Lisp), but focus on single dispatch on classes (like Java or Ruby). Unlike both, inheritance of objects is totally decoupled from them; objects either conform to protocols or they do not, without any notion of hierarchy. And in this, Clojure developers get access to something not unlike Ruby's infamous “monkey patching”, without the danger.

== Monkey Patching

Before we dive into what Clojure is doing, let's review how other people have tried to solve this problem. Java and C# have gone the route of interfaces, which are basically abstract classes that you tie into the inheritance hierarchy. Python and Ruby go for a more dynamic what-you-find-is-what-you-get approach that is typical for dynamic typing, using message sending as the dispatching mechanism. This generally works, but has some serious issues. In Ruby, reopening an existing class to add or change methods is called “Monkey Patching” and it's considered fairly dangerous. For example:


class String
  def monkeypatched
    puts "Banana!"
  end
end

"madness".monkeypatched # prints "Banana!"

In case that went by too fast, we re-opened the system's String class and shoved a new method on there. All subsequent strings would have the method. This is a very powerful technique that lets us extend classes that we know about, and it's pretty impressive when you do a few clever monkeypatches and suddenly your code looks 2 lines long, but...

The danger of open classes and this sort of chicanery is that you can step on other people's toes. Since a monkeypatch echoes around the code, conflicting monkeypatches can subtly and thoroughly hose your system. Ruby lore is full of examples of terrible consequences of monkeypatching. Every monkeypatch has to be carefully weighed against the risk that other libraries might want to modify that same behavior. Unpredictability in your base classes is generally not something to be encouraged, and so in many software shops the practice is outright forbidden or used only with extreme caution.

Really, that's a shame. People want to use the technique because it can make code incredibly clear and succinct. System class writers can't anticipate everything in advance, and the ability to project our verbs and behaviors onto existing classes would be a real boon.

== Gorilla Engineering

=== Protocols

Enter Clojure with its Protocols and Types. Let's start with Protocols because they're simple. A protocol is sort of like an interface in Java or C#, but isn't associated with any inheritance. A protocol is just a list of methods tied to a name in a namespace. Here's an example of Clojure code for a protocol:


(ns blogpost.bloomfilters)

(defprotocol BloomFilterable
  (make-hash [object])
  (human-readable-hash [object]))

So we're writing a library that uses Bloom Filters to test for object membership. Every object may need to write its own logic for how to hash itself suitably for a bloom filter (e.g., collections might want to hash each member individually or focus on their own identity). We start by making a protocol that describes what we'd like to see types do. This protocol is just a bundle of method signatures associated with the name “blogpost.bloomfilters.BloomFilterable”. Put that thought on hold for a minute; we'll come back to it shortly.

=== Types

Types are the opposite of Protocols: they are just data, with no required behavior attached. The basic syntax for types is trivial:


;; Some type examples:
(deftype MyType [three data fields])
(defrecord MyRecord [three data fields])
(deftype MyIntHolder [^int data])
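
For a rough feel of what these forms produce, here is a minimal REPL sketch, assuming Clojure 1.2 or later. Point and PointRecord are hypothetical names used only for illustration; they are not part of the bloom filter library.

```clojure
;; Sketch only: Point and PointRecord are hypothetical example types.
(deftype Point [x y])
(defrecord PointRecord [x y])

;; Both are constructed positionally with the ClassName. form.
(def p (Point. 1 2))
(def r (PointRecord. 1 2))

(.x p)           ; field access through Java interop works for both
(.x r)

(:x r)           ; records also act like maps: keyword lookup works
(:x p)           ; nil, because a bare deftype has no map behavior
(assoc r :x 10)  ; returns a new PointRecord with :x replaced

(= r (PointRecord. 1 2)) ; true: records compare by value
(= p (Point. 1 2))       ; false: a bare deftype compares by identity
```

The design trade-off is that defrecord hands you map-like behavior and value equality for free, while deftype stays out of the way so you can decide exactly what the type does.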

The difference between defrecord and deftype is that defrecord supports keyed access (like hashmaps in Clojure) and some basic helper methods, while deftype makes exactly what you specify and nothing more. Simple, right?

=== Extending Protocols and Types

We have types, we have protocols, so let's connect our BloomFilterable to our MyIntHolder type:


(require '[blogpost.bloomfilters :as bloom])

; We can do it this way:
(extend-protocol bloom/BloomFilterable
  MyIntHolder
  (make-hash [this] (.data this)) ; return own data
  (human-readable-hash [this] (str (.data this)))) ; return a string

; Or perhaps like this:
(extend-type MyIntHolder
  bloom/BloomFilterable
  (make-hash [this] (.data this))
  (human-readable-hash [this] (str (.data this))))

We project the protocol onto the type, or project the type onto the protocol. Either way, we've now defined behavior. We can call (bloom/make-hash (MyIntHolder. 10)) and it'll know the right thing to do. And people who use your library and want to define their own objects that can be put in the bloom filter can also project that protocol onto their objects and types. We can even project our protocols onto the system types safely!


; We can call (bloom/make-hash "hello there!")
(extend-type java.lang.String
  bloom/BloomFilterable
  (make-hash [this] (.length this)) ; Not a very good hash function!
  (human-readable-hash [this] (str (.length this))))

; And (bloom/make-hash 1)
(extend-type java.lang.Integer
  bloom/BloomFilterable
  (make-hash [this] this)
  (human-readable-hash [this] (str this)))
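
Pulling the pieces together, here is a minimal end-to-end sketch that can be pasted into a REPL. It makes a few assumptions that differ from the snippets in this post: everything lives in one namespace (so no bloom/ alias is needed), the data field is left unhinted, and the integer case extends java.lang.Long, because Clojure releases after 1.2 read integer literals as Long rather than Integer.

```clojure
;; Sketch only: single namespace, unhinted field, and Long instead of
;; Integer are assumptions made so this runs on current Clojure versions.
(defprotocol BloomFilterable
  (make-hash [object])
  (human-readable-hash [object]))

(deftype MyIntHolder [data])

;; Project the protocol onto our own type.
(extend-type MyIntHolder
  BloomFilterable
  (make-hash [this] (.data this))
  (human-readable-hash [this] (str (.data this))))

;; Project the protocol onto two system types in one form.
(extend-protocol BloomFilterable
  String
  (make-hash [this] (.length this)) ; still not a very good hash function
  (human-readable-hash [this] (str (.length this)))
  Long
  (make-hash [this] this)
  (human-readable-hash [this] (str this)))

(make-hash (MyIntHolder. 10))          ; 10
(human-readable-hash "hello there!")   ; "12"

;; Protocol methods are ordinary functions, so they compose like any other.
(map make-hash ["a" "bb" 3])           ; (1 2 3)

;; Ask whether a value or a class participates before calling.
(satisfies? BloomFilterable 1.5)       ; false: Double was never extended
(extends? BloomFilterable String)      ; true
```

Calling make-hash on a value whose type was never extended, such as 1.5 here, throws an IllegalArgumentException instead of silently doing the wrong thing.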

Since the functions we're calling in the protocol are namespaced to our blogpost.bloomfilters namespace, there isn't any risk of anyone else tripping over that name. Every other namespace could have a make-hash function, each doing something totally different, and a careful Clojure programmer could successfully use all of them as intended.

=== Performance & Safety

This approach is slightly more static than the monkey patching technique we described for Ruby; there are protocols laid out in advance and you write code to them. But in general, most Ruby mixins (and even Objective-C delegates) have “informal protocols”. There is usually a set of functions that the logic expects to be able to call on objects, at least to coerce them to known types. So writing them out in advance is probably a good idea. You're going to put it in your documentation anyway.

For the small cost of writing out an agreed-upon contract in advance, Clojure gives you quite a bit. In terms of safety, this approach is light-years ahead. And performance-wise, the Clojure compiler can make smart decisions about how to make calls, making a protocol call nearly as fast as a direct method invocation on an Object. But most of all, this gives you controlled extensibility. Library writers can write their code to protocols and generics that library users can safely use to slide their own types into place.

== There Is More I Am Not Saying

Protocols and Types are cool, but they're only one of the mechanisms that Clojure provides for developers who want extensible interfaces. Clojure also provides a multiple dispatch facility with defmulti and defmethod, along with arbitrary ad-hoc type hierarchies. While slightly less efficient than the new protocol system, multimethods allow you to get rid of degenerate implicit logic like the Visitor Pattern. And of course all the stuff everyone glows about in Clojure is getting better with every release.
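
To make the multiple dispatch point a little more concrete, here is a small sketch using defmulti, defmethod, and derive. The overlap function, the :kind key, and the shape keywords are all hypothetical and exist only for this illustration.

```clojure
;; Sketch only: overlap, :kind, and the shape keywords are hypothetical.

;; An ad-hoc hierarchy built from keywords; no classes are involved.
(derive ::square ::rectangle)

;; The dispatch function looks at BOTH arguments, not just the first.
(defmulti overlap
  (fn [a b] [(:kind a) (:kind b)]))

(defmethod overlap [::rectangle ::rectangle] [a b] :rect-rect)
(defmethod overlap [::rectangle ::circle]    [a b] :rect-circle)
(defmethod overlap :default                  [a b] :unknown)

;; ::square derives from ::rectangle, so the rectangle methods apply.
(overlap {:kind ::square} {:kind ::circle})    ; :rect-circle
(overlap {:kind ::square} {:kind ::rectangle}) ; :rect-rect
(overlap {:kind ::circle} {:kind ::circle})    ; :unknown
```

Where a Visitor would need an accept/visit pair per class to simulate dispatch on two types, the multimethod states each pairing directly.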
Please note that 1.2 has not yet been released, so deftype and defprotocol might still change a bit before launch. Fortunately the Clojure core team keeps the documentation up to date, and every core library symbol has inline documentation (type (doc defprotocol) at the REPL to get the latest info).