Skip to content

Instantly share code, notes, and snippets.

@ericnormand
Created October 4, 2019 14:25
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ericnormand/ab3eb89a7d065dad742d9616b11e96b6 to your computer and use it in GitHub Desktop.
Save ericnormand/ab3eb89a7d065dad742d9616b11e96b6 to your computer and use it in GitHub Desktop.

exponential backoff

Our computers have real limits. Limited memory. Limited threads. Limited computations per second. Etc. We often treat our systems like they have unlimited resources, which is fine, until they get overloaded with work.

Two weeks ago, we wrote a higher order function to retry a function three times if it failed. However, if a function fails because the system is overloaded, trying again immediately is not going to help anything. In fact, it could make it worse.

A better response would be to wait a time before trying again. If it still fails, double the time and try again. Then double again, etc, until a limit is reached. This is known as exponential backoff.

Your job this week is to implement exponential backoff. Extra credit if you can tell it how long to wait before ultimately giving up.

(ns backoff.main
(:require [clojure.core.async :as async :refer [<! alt!! go-loop timeout]]))
(defn throwable? [x]
(instance? Throwable x))
(defn retry-chan [max-attempts f]
(let [interval 500] ; could make this configurable someday
(go-loop [attempts 1]
(let [result (try (f) (catch Throwable t t))]
(if (throwable? result)
(if (< attempts max-attempts)
(do
; exponential backoff with "jitter" based on https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
(<! (timeout (rand (* interval (Math/pow 2 attempts)))))
(recur (inc attempts)))
::attempts-exceeded)
result)))))
(let [default-options {:attempts Integer/MAX_VALUE
:timeout 3000}]
(defn retriably [f & [options]]
(let [{:keys [attempts timeout]} (merge default-options options)]
(fn [& args]
(let [result (alt!! (retry-chan attempts #(apply f args)) ([x] x)
(async/timeout timeout) ::timed-out)]
(case result
::timed-out (throw (ex-info "timed out" {:reason ::timed-out
:timeout timeout}))
::attempts-exceeded (throw (ex-info "attempts exceeded" {:reason ::attempts-exceeded
:attempts attempts}))
result))))))
(ns backoff.main-test
(:require [backoff.main :refer [retriably]]
[clojure.test :refer [deftest is testing]]))
(deftest retriably-test
(testing "success"
(is (= 4 ((retriably #(* % %)) 2))))
(testing "attempts exceeded"
(let [bad (retriably #(/ 1 0) {:attempts 1})]
(is (thrown? clojure.lang.ExceptionInfo (bad)))
(let [data (try (bad) (catch clojure.lang.ExceptionInfo ex (ex-data ex)))]
(is (= {:reason :backoff.main/attempts-exceeded
:attempts 1}
data)))))
(testing "timed out"
(let [bad (retriably #(/ 1 0) {:timeout 1})]
(is (thrown? clojure.lang.ExceptionInfo (bad)))
(let [data (try (bad) (catch clojure.lang.ExceptionInfo ex (ex-data ex)))]
(is (= {:reason :backoff.main/timed-out
:timeout 1}
data))))))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment