Skip to content

Instantly share code, notes, and snippets.

@ship561
Created March 3, 2016 22:51
Show Gist options
  • Save ship561/cbc8d099348a00a38b43 to your computer and use it in GitHub Desktop.
Save ship561/cbc8d099348a00a38b43 to your computer and use it in GitHub Desktop.
Levenshtein distance that short circuits
(defn levenshtein [thr str1 str2]
"Clojure Levenshtein distance which short circuits when the distance is > thr"
(let [n (count str1)
m (count str2)]
(cond
(= 0 n) m
(= 0 m) n
:else
(last
(reduce (fn [prev-col i]
(let [nextrow (reduce (fn [col j]
(let [lev (fn [i j]
(min (inc (nth col j))
(inc (nth prev-col (inc j)))
(+ (get prev-col j)
(if (= (nth str1 i) (nth str2 j)) 0 1))))]
(conj col (lev i j)))) ; update col[1..m]
[(inc i)]
(range m))]
(if (<= (apply min nextrow) thr)
nextrow
(reduced [:big]))))
(vec (range (inc m)))
(vec (range n)))) ; initialization for the first column.
)))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment