Skip to content

Instantly share code, notes, and snippets.

@jberryman
Created March 21, 2012 16:03
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save jberryman/2149007 to your computer and use it in GitHub Desktop.
Save jberryman/2149007 to your computer and use it in GitHub Desktop.
an optimized haskell Levenshtein distance, using the vector library
-- an optimized minimum edit distance function, using the 'vector' library.
-- By making careful use of the stream-fusing functions in Vector, we can
-- write an implementation that looks like idiomatic list-processing code
-- but runs like tight mutable array code.
import qualified Data.Vector.Unboxed.Safe as V
-- we use suffixes _W, _SW, etc. to indicate the cell to the west, southwest,
-- and so on in the more traditional matrix-based approach.
med :: String -> String -> Int
med s1 = V.last . V.ifoldl' scanS1 costs_i . V.fromList
where vs1 = V.fromList s1
costs_i = V.enumFromN 1 $ V.length vs1 -- [0.. length s1]
scanS1 costs_W costSW_i c2 =
let v = V.zip vs1 costs_W
v' = V.postscanl' scanVec (costSW_i, costSW_i + 1) v
scanVec (costSW, costS) (c1, costW) =
(costW, min (min costS costW + 1)
(costSW + if c1 == c2 then 0 else 2))
in snd $ V.unzip v'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment