Skip to content

Instantly share code, notes, and snippets.

@danlei
Created May 25, 2011 04:05
Show Gist options
  • Save danlei/990306 to your computer and use it in GitHub Desktop.
Save danlei/990306 to your computer and use it in GitHub Desktop.
Simple Haskell implementation of the soundex algorithm. difference :: String -> String -> Int yields "phonetic distance" between words as an integer in range [0..4].
module Soundex
where
import Data.Maybe (catMaybes)
import GHC.Exts (the)
import Data.List (group)
import Data.Function (on)
classes = [("bfpv",'1'),
("cgjkqsxzß",'2'),
("dt",'3'),
("l",'4'),
("mn",'5'),
("r",'6'),
("aeiouäöüy",'0')]
classOf :: Char -> Maybe Char
classOf c = classOf' c classes
where
classOf' _ [] = Nothing
classOf' c ((cs,cls):clss)
| c `notElem` cs = classOf' c clss
| otherwise = Just cls
encode :: String -> String
encode (c:cs) = c : '-' : take' 3 (encode' cs)
where
encode' = filter (/='0') . map the . group . catMaybes . map classOf
take' n xs = take n (xs ++ repeat '0')
difference :: String -> String -> Int
difference w1 w2 = length . filter (uncurry (/=)) $ (zip `on` encode) w1 w2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment