msgodf/condition-systems.md

    Condition Systems in an Exceptional Language by Chris Houser

I recently watched Chris Houser's talk from Clojure/conj 2015 on condition systems in Clojure. I enjoyed the subject, so I decided to write up the talk as a way of encouraging me to really understand it, and in the hope that it might help others understand it.
The last time I heard about Common Lisp's condition system was at a talk by Didier Verna at ACCU in Bristol in 2013 (slides here). It sounded really interesting, but I didn't understand it well enough.
tl;dr

Chris Houser talks about different ways of handling errors in Clojure. Based on examples from Peter Seibel's book, Practical Common Lisp, he describes condition systems, which are also known as resumable exceptions, or restarts.
Condition systems are more sophisticated than regular exception throwing and catching. They allow higher level functions to specify how errors are handled where and when they occur, rather than having the exception thrown and handled when it is too far away from the origin to do anything about it.
He first gives an example using the slingshot and swell libraries, and then uses Clojure's built-in dynamic binding to implement the same features.
Things that I found interesting


Condition systems
Exploring dynamic binding in Clojure

A walkthrough of the talk

Opens the talk with a quote


The action taken after detecting a software error (e.g. returning error codes) should be uniform for all components in the system. This leads to the difficult question of what action to take when an error is detected. The best action is to immediately terminate the program.

Myers, Glenford J. (1976) Software Reliability Principles & Practices
Ways to deal with errors

Ordered from dramatic to subtle:

Crash
Enter debugger
Throw exception
Return special value
Set error flag
Recover, keep going

The examples in this talk are based on those in Chapter 19 of Practical Common Lisp by Peter Seibel.
The set up

There are four functions, ordered here from the highest level to the lowest (each calling the next): log-analyzer > analyze-log > parse-log-file > parse-log-entry.
The lowest-level function in the stack is parse-log-entry, which checks whether the input text is well formed.
(defn parse-log-entry [text]
  (if (well-formed-log-entry? text)
    {:successfully-parsed text}
    (throw (ex-info (str "Log entry was malformed; could not parse")
                    {:type ::malformed-log-entry
                     :text text}))))
This is called by parse-log-file, which opens the file and splits it into lines for processing.
(defn parse-log-file [log]
  (let [lines (with-open [stream (io/reader log)]
                 (doall (line-seq stream)))]
     (keep parse-log-entry lines)))
This includes an enclosing dynamic scope. Everything in parse-log-entry is happening inside the dynamic scope of parse-log-file. This is contrasted with the more typical lexical scoping as provided by let (and the implicit let in defn and fn).
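The difference between lexical and dynamic scope can be sketched with a small, self-contained example. The names `*level*`, `report-level`, `lexical-demo` and `dynamic-demo` are hypothetical and not from the talk; they only illustrate how a callee sees a `binding` but not a caller's `let`.

```clojure
;; Hypothetical names, for illustration only.
(def ^:dynamic *level* :root)

(defn report-level []
  ;; Looks up *level* at call time, so it sees whatever binding
  ;; is active in the caller's dynamic scope.
  *level*)

(defn lexical-demo []
  ;; let introduces a local name that is only visible in this body;
  ;; report-level cannot see it.
  (let [level :inner]
    [level (report-level)]))

(defn dynamic-demo []
  ;; binding changes what report-level sees for the duration of the body.
  (binding [*level* :inner]
    (report-level)))

(lexical-demo)  ;; => [:inner :root]
(dynamic-demo)  ;; => :inner
(report-level)  ;; => :root
```

Once the `binding` form exits, the var goes back to its previous value, which is why the last call returns `:root`.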
This function is called by analyze-log, which takes a single log file and goes through all the entries.
(defn analyze-log [log]
  (doseq [entry (parse-log-file log)]
    (analyze-entry entry)))
And finally, log-analyzer, which finds all the log files and parses and analyzes each of them. It represents the rest of the application, and all the functions we described before are inside its dynamic scope.
(defn log-analyzer []
  (doseq [log (find-all-logs)]
    (analyze-log log)))
The option taken in parse-log-entry is to throw an exception, at the detection site. This bubbles up through the enclosing dynamic scope, and results in the process crashing. A single malformed log entry results in the entire process stopping.
This is why we have try and catch, so we can catch this exception in parse-log-file, and return nil when the exception has a :type of ::malformed-log-entry. The function keep calls the function on each item of lines, and discards any items for which the function evaluates to nil. So the bad entries are discarded.
(defn parse-log-file [log]
  (let [lines (with-open [stream (io/reader log)]
                 (doall (line-seq stream)))]
     (keep #(try
              (parse-log-entry %)
              (catch clojure.lang.ExceptionInfo e
                (if (= ::malformed-log-entry
                       (:type (ex-data e)))
                   nil ;; Skip bad entries
                   (throw e))))
           lines)))
Looking back at the choices of how to handle an error, this is now "Recover, and keep going".
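To try this out without any log files, here is a minimal sketch that works on an in-memory sequence of lines. The predicate `well-formed-log-entry?` is not shown in the talk notes, so the definition below (anything starting with "bad" is malformed) is an assumption, as is the helper name `parse-lines`.

```clojure
(require '[clojure.string :as string])

;; Assumption: an entry is malformed if it starts with "bad".
(defn well-formed-log-entry? [text]
  (not (string/starts-with? text "bad")))

(defn parse-log-entry [text]
  (if (well-formed-log-entry? text)
    {:successfully-parsed text}
    (throw (ex-info "Log entry was malformed; could not parse"
                    {:type ::malformed-log-entry
                     :text text}))))

;; Hypothetical stand-in for parse-log-file that takes lines directly.
(defn parse-lines [lines]
  (keep #(try
           (parse-log-entry %)
           (catch clojure.lang.ExceptionInfo e
             (if (= ::malformed-log-entry (:type (ex-data e)))
               nil          ; skip bad entries
               (throw e)))) ; anything else is not ours to handle
        lines))

(parse-lines ["ok1" "bad1" "ok2"])
;; => ({:successfully-parsed "ok1"} {:successfully-parsed "ok2"})
```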
We discover an error in parse-log-entry and we define how to skip an entry in parse-log-file. But parse-log-file is now doing the parsing, and choosing how to skip the entry. We'd like to separate this and give the control to someone else.
The talk's next step is to use a pair of libraries called slingshot and swell (swell is built on top of slingshot).
Using slingshot and swell

The combination of these libraries provides a condition system, also known as "resumable exceptions", "resumable conditions", "resumable errors", or sometimes just "restarts".


| Common Lisp | slingshot & swell |
| --- | --- |
| `define-condition` and `make-condition` | not needed |
| `error` and `signal` | `throw+` |
| `handler-case` | `handler-bind` |
| `restart-case` | `restart-case` |
| `invoke-restart` | `invoke-restart` |


The point of this is to show the close correspondence between the Common Lisp approach and that of slingshot and swell.
The first thing is to replace Clojure's built-in throw with Slingshot's throw+. This does roughly the same thing, but doesn't require ex-info to add information about the error.
(defn parse-log-entry [text]
  (if (well-formed-log-entry? text)
    {:successfully-parsed text}
    (throw+ {:type ::malformed-log-entry, :text text}
            "Log entry was malformed; could not parse")))
Using Swell's restart-case, parse-log-file can offer a named restart, :skip-log-entry, whose implementation simply returns nil.
(defn parse-log-file [log]
  (let [lines (with-open [stream (io/reader log)]
                 (doall (line-seq stream)))]
    (keep #(restart-case
              [:skip-log-entry (fn [] nil)]
              (parse-log-entry %))
          lines)))
Now up at the higher level (importantly, anywhere higher in the dynamic stack), log-analyzer can use Swell's handler-bind to define the behaviour when an error occurs.
The handler pairs a predicate with an action to take when the predicate matches; in this case the action is to invoke the named restart :skip-log-entry.
(defn log-analyzer []
  (handler-bind
     [#(= ::malformed-log-entry (:type %))
      (fn [e]
        (invoke-restart :skip-log-entry))]
     (doseq [log (find-all-logs)]
       (analyze-log log))))
The higher-level function chooses what to do, and the lower-level function defines how to do it.
Now that we have named restarts, there are other things we can offer at the lower level:
:use-value. Higher-level code can invoke this and supply a value to use in place of the failed parse, instead of skipping the entry or throwing an exception.
:reparse-entry. Higher-level code can supply fixed text, which is then parsed again.
(defn parse-log-entry [text]
  (if (well-formed-log-entry? text)
    {:successfully-parsed text}
    (restart-case
      [:use-value (fn [value] value)
       :reparse-entry (fn [fixed-text] (parse-log-entry fixed-text))]
      (throw+ {:type ::malformed-log-entry :text text}
              "Log entry was malformed; could not parse"))))
[goes off on a brief segue to describe how agents handle errors]
Some existing condition system libraries


error-kit
slingshot and swell
ribol (now part of hara)
bwo/conditions

How is this possible?

All of the libraries are based on the idea described in a paper named A Syntactic Theory of Dynamic Binding by Luc Moreau. In particular the statement

we show that dynamic binding is an essential notion in semantics that can be used to define the semantics of exceptions.

This paper was the basis of an implementation of resumable exceptions in ML by Oleg Kiselyov (http://okmij.org/ftp/ML/resumable.ml), which in turn, was the inspiration for a conditions system library in Clojure (bwo/conditions).
Why not use an existing condition system library?

He describes a potential issue, which he calls a composition problem. Suppose your application uses two libraries, and each is built on a different condition system library. If you want to handle restarts from both libraries in your app, you have to deal with two different mechanisms.

Most Clojure library authors know this, and choose not to burden consumers of their library with a particular condition system.

And the solution he describes, using Clojure's dynamic binding, is from the last chapter of the Joy of Clojure

Section 17.4 of Joy of Clojure is titled "Error handling and debugging"

But when using Clojure’s dynamic var binding, you can achieve a more active mode of error handling, where handlers are pushed into inner functions. In section 10.6.1, we mentioned that the binding form is used to create thread-local bindings, but its utility isn’t limited to this use case. In its purest form, dynamic scope is a structured form of a side effect (Steele 1978). You can use it to push vars down a call stack from the outer layers of a function, nesting into the inner layers

Excerpt From: Michael Fogus, Chris Houser. “The Joy of Clojure, Second Edition.”

Define a function representing the error, and make sure you mark it as a dynamic var. That one adjustment, pulling the throw out into a separate function and marking it as dynamic, gives you a ton of control.

(defn ^:dynamic *malformed-log-entry-error* [msg info]
  (throw (ex-info msg info)))
  
(defn parse-log-entry [text]
  (if (well-formed-log-entry? text)
    {:successfully-parsed text}
    (*malformed-log-entry-error*
      "Log entry was malformed; could not parse"
      {:text text})))
And now the log-analyzer function uses Clojure's built-in binding form. We provide a new behaviour.
(defn log-analyzer []
  (binding [*malformed-log-entry-error* (constantly nil)]
    (doseq [log (find-all-logs)]
      (analyze-log log))))
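The two snippets above can be combined into a self-contained sketch that runs on in-memory lines. As before, the definition of `well-formed-log-entry?` is an assumption, and `parse-lines` and `analyze-lines` are hypothetical stand-ins for parse-log-file and log-analyzer.

```clojure
(require '[clojure.string :as string])

;; Assumption: an entry is malformed if it starts with "bad".
(defn well-formed-log-entry? [text]
  (not (string/starts-with? text "bad")))

;; The default behaviour is still to throw.
(defn ^:dynamic *malformed-log-entry-error* [msg info]
  (throw (ex-info msg info)))

(defn parse-log-entry [text]
  (if (well-formed-log-entry? text)
    {:successfully-parsed text}
    (*malformed-log-entry-error*
      "Log entry was malformed; could not parse"
      {:text text})))

;; Hypothetical stand-in for parse-log-file; keep is lazy.
(defn parse-lines [lines]
  (keep parse-log-entry lines))

;; Hypothetical stand-in for log-analyzer.
(defn analyze-lines [lines]
  (binding [*malformed-log-entry-error* (constantly nil)]
    ;; doall forces the lazy sequence while the binding is in effect.
    (doall (parse-lines lines))))

(analyze-lines ["ok1" "bad1" "ok2"])
;; => ({:successfully-parsed "ok1"} {:successfully-parsed "ok2"})
```

The `doall` matters in this sketch: if the lazy sequence were returned unrealized and consumed outside the `binding` form, the binding would already be gone and the root (throwing) definition would run. In the talk's version, `doseq` consumes the entries inside the binding, so the same care is already taken.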
But one of the problems at this point is that log-analyzer is exploiting its knowledge of the interior function. To avoid this, we can use the same technique to name some restarts.
;; === restarts ===
(def ^:dynamic *use-value*)
(def ^:dynamic *skip-log-entry*)
(def ^:dynamic *reparse-entry*)

Calling any of these without them being defined would result in an error - just like in Common Lisp.
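As a quick check of that claim, the sketch below calls an unbound dynamic var and catches the resulting `IllegalStateException`. The wrapper `call-restart-safely` is a hypothetical helper, used only to make the behaviour observable.

```clojure
(def ^:dynamic *use-value*)

;; Hypothetical helper: tries to invoke the restart and reports
;; whether anything was bound at the time of the call.
(defn call-restart-safely []
  (try
    (*use-value* 42)
    (catch IllegalStateException _
      :no-restart-bound)))

(bound? #'*use-value*)           ;; => false
(call-restart-safely)            ;; => :no-restart-bound

(binding [*use-value* identity]
  (call-restart-safely))         ;; => 42
```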

These restart functions can be defined using binding:
(defn parse-log-entry [text]
  (if (well-formed-log-entry? text)
    {:successfully-parsed text}
    (binding [*use-value* identity
              *reparse-entry* parse-log-entry]
       (*malformed-log-entry-error*
          "Log entry was malformed; could not parse"
          {:text text}))))
The handler that log-analyzer binds then decides what happens when a malformed log entry is encountered. In this case, it uses a substitute value.
Instead of building in the knowledge of (constantly nil), the handler can just call the restart. When parse-log-entry calls *malformed-log-entry-error*, which in turn calls *use-value*, the binding of *use-value* established in parse-log-entry is the one in effect.
(defn log-analyzer []
  (binding [*malformed-log-entry-error*
            (fn [msg info]
              (*use-value* {:failed-to-parse (:text info)}))]
    (doseq [log (find-all-logs)]
      (analyze-log log))))
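Putting the restart vars, the error function and a handler together gives a runnable sketch. The stub `well-formed-log-entry?`, the helpers `analyze-with-use-value` and `analyze-with-reparse`, and the "strip the bad prefix" fix are all assumptions made for illustration.

```clojure
(require '[clojure.string :as string])

;; Assumption: an entry is malformed if it starts with "bad".
(defn well-formed-log-entry? [text]
  (not (string/starts-with? text "bad")))

;; Restarts, unbound until parse-log-entry offers them.
(def ^:dynamic *use-value*)
(def ^:dynamic *reparse-entry*)

(defn ^:dynamic *malformed-log-entry-error* [msg info]
  (throw (ex-info msg info)))

(defn parse-log-entry [text]
  (if (well-formed-log-entry? text)
    {:successfully-parsed text}
    (binding [*use-value* identity
              *reparse-entry* parse-log-entry]
      (*malformed-log-entry-error*
        "Log entry was malformed; could not parse"
        {:text text}))))

;; Handler that substitutes a value for the failed parse.
(defn analyze-with-use-value [lines]
  (binding [*malformed-log-entry-error*
            (fn [msg info]
              (*use-value* {:failed-to-parse (:text info)}))]
    (mapv parse-log-entry lines)))

;; Handler that repairs the text (hypothetical fix) and reparses it.
(defn analyze-with-reparse [lines]
  (binding [*malformed-log-entry-error*
            (fn [msg info]
              (*reparse-entry* (subs (:text info) 3)))]
    (mapv parse-log-entry lines)))

(analyze-with-use-value ["ok1" "bad1"])
;; => [{:successfully-parsed "ok1"} {:failed-to-parse "bad1"}]
(analyze-with-reparse ["ok1" "bad1"])
;; => [{:successfully-parsed "ok1"} {:successfully-parsed "1"}]
```

`mapv` is used so that parsing happens eagerly, inside the handler's `binding`.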
These dynamic errors can be defined hierarchically:
(defn ^:dynamic *error* [msg info]
  (throw (ex-info msg info)))
  
(defn ^:dynamic *math-error* [msg info]
  (*error* msg info))
 
(defn ^:dynamic *sqrt-of-negative* [msg info]
  (*math-error* msg info))
  
(defn ^:dynamic *malformed-entry-error*  [msg info]
  (*error* msg info))
  
(defn ^:dynamic *malformed-log-entry-error*  [msg info]
  (*malformed-entry-error* msg info))
This allows users to bind anywhere in that tree, and catch all the errors below it in the delegation tree.
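A small sketch of binding at different points in the tree follows. The function `safe-sqrt` and the keywords returned by the handlers are hypothetical, chosen only to show which handler ends up being called.

```clojure
(defn ^:dynamic *error* [msg info]
  (throw (ex-info msg info)))

(defn ^:dynamic *math-error* [msg info]
  (*error* msg info))

(defn ^:dynamic *sqrt-of-negative* [msg info]
  (*math-error* msg info))

;; Hypothetical function that signals the most specific error.
(defn safe-sqrt [x]
  (if (neg? x)
    (*sqrt-of-negative* "Cannot take the square root of a negative number"
                        {:x x})
    (Math/sqrt x)))

;; Binding the middle of the tree handles everything that delegates to it.
(binding [*math-error* (fn [msg info] :math-problem)]
  (safe-sqrt -1))   ;; => :math-problem

;; Binding the root handles every error in the tree.
(binding [*error* (fn [msg info] :some-problem)]
  (safe-sqrt -1))   ;; => :some-problem

(safe-sqrt 4)       ;; => 2.0
```

This works because each default definition looks up its parent var when it is called, so a binding established further up the call stack is picked up at that moment.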
Common Lisp also provides the ability to decline to handle an error, allowing it to bubble up to higher-level functions.
This can be implemented by using a let (lexical) binding to grab the old behaviour of the error handler before the dynamic binding occurs, and then call that from within the handler if it is necessary to decline to handle the error.
(defn analyze-log [log]
  (let [decline-malformed-log-entry-error *malformed-log-entry-error*]
    (binding [*malformed-log-entry-error* (fn [msg {:as info :keys [text]}]
                                             (if (= "bad1" text)
                                                (*use-value* {:bad1-is-ok text})
                                                (decline-malformed-log-entry-error msg info)))]
      (doseq [entry (parse-log-file log)]
        (analyze-entry entry)))))
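The declining pattern can be exercised end to end with the sketch below. The stub `well-formed-log-entry?`, the helpers `analyze-lines` and `analyze-all`, and the `:skipped` value are assumptions for illustration; only the let-then-binding shape comes from the code above.

```clojure
(require '[clojure.string :as string])

;; Assumption: an entry is malformed if it starts with "bad".
(defn well-formed-log-entry? [text]
  (not (string/starts-with? text "bad")))

(def ^:dynamic *use-value*)

(defn ^:dynamic *malformed-log-entry-error* [msg info]
  (throw (ex-info msg info)))

(defn parse-log-entry [text]
  (if (well-formed-log-entry? text)
    {:successfully-parsed text}
    (binding [*use-value* identity]
      (*malformed-log-entry-error*
        "Log entry was malformed; could not parse"
        {:text text}))))

;; Inner handler: accepts "bad1", declines everything else.
(defn analyze-lines [lines]
  (let [decline *malformed-log-entry-error*] ; handler in effect before rebinding
    (binding [*malformed-log-entry-error*
              (fn [msg {:as info :keys [text]}]
                (if (= "bad1" text)
                  (*use-value* {:bad1-is-ok text})
                  (decline msg info)))]
      (mapv parse-log-entry lines))))

;; Outer handler: receives whatever the inner handler declined.
(defn analyze-all [lines]
  (binding [*malformed-log-entry-error*
            (fn [msg info] (*use-value* {:skipped (:text info)}))]
    (analyze-lines lines)))

(analyze-all ["ok1" "bad1" "bad2"])
;; => [{:successfully-parsed "ok1"} {:bad1-is-ok "bad1"} {:skipped "bad2"}]
```

If `analyze-lines` is called with no outer handler bound, a declined error reaches the root definition, which throws.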
Closes the talk with a quote


Those who regularly code for fast electronic computers will have learned from bitter experience that a large fraction of the time spent in preparing calculations for the machine is taken up in removing the blunders that have been made in drawing up the programme.
With the aid of common sense and checking subroutines the majority of mistakes are quickly found and rectified. Some errors, however, are sufficiently obscure to escape detection for a surprisingly long time.

R.A. Brooker, S. Gill, and D. J. Wheeler, "The Adventures of a Blunder," Mathematical Tables and Other Aids to Computation, 6 (38), 112-113 (1952).

Step back and think about this. Do we really want complex code in obscure corners of our sad path? [...]
How many checking subroutines do we need to make sure we don't have complex errors in our complex error handling?

Conclusion


Consider alternatives to throwing exceptions
Try out the built-in condition system, knowing this technique can take you as far as you're likely to need.
Use an existing condition system where it makes sense - where you have control over the entire application and the code isn't going to be used as a library

A parting thought

Finishes by mentioning a tweet from Carin Meier.

Vast opportunities that we software developers have to make models that prove and expand human knowledge, but we're too busy plumbing.

And his thoughts about this

So, think about this, make sure you do a good job. Consider how to handle your errors, but then move on, and do something actually important. Don't squander a lot of your time thinking about condition systems.

Miscellaneous things I learned


The oops! behaviour of the Linux kernel https://en.wikipedia.org/wiki/Linux_kernel_oops
The function keep from clojure.core. It applies a function to each item of a collection and lazily returns only the non-nil results. It takes a function, along with an optional collection. Like many of the sequence functions in clojure.core since version 1.7, it returns a transducer if no collection is supplied.
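Both arities of `keep` can be seen in a short sketch. The helper `even-or-nil` is hypothetical and exists only to produce some nil results.

```clojure
;; Hypothetical helper: returns n when it is even, otherwise nil.
(defn even-or-nil [n]
  (when (even? n) n))

;; With a collection: a lazy sequence of the non-nil results.
(keep even-or-nil [1 2 3 4])             ;; => (2 4)

;; Without a collection: a transducer, used here with into.
(into [] (keep even-or-nil) [1 2 3 4])   ;; => [2 4]

;; Only nil is dropped; false is a non-nil result and is kept.
(keep even? [1 2])                       ;; => (false true)
```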

Source from the talk

The code from the talk is in a repo on GitHub. The readme for that repo contains a link to a document with background research and a talk outline.