Skip to content

Instantly share code, notes, and snippets.

@lorenzo
Created September 6, 2018 09:53
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save lorenzo/2247f26b3c8c6df97b17a6220ddc0338 to your computer and use it in GitHub Desktop.
Save lorenzo/2247f26b3c8c6df97b17a6220ddc0338 to your computer and use it in GitHub Desktop.

StringDistance

A library to calculate a metric indicating the string distance between two strings.

This library was extracted from the Elm implementation of mailcheck http://package.elm-lang.org/packages/rluiten/mailcheck/latest.

The lcs and lcsLimit functions are more general and support more than just Char as list elements.

Functions

sift3Distance : String.String -> String.String -> Basics.Float

Calculate sift3 string distance between candidate strings.

    sift3Distance "" "abc" == 3.0
    sift3Distance "ab" "" == 2.0
    sift3Distance "abc" "abc" == 0
    sift3Distance "abc" "ab"  == 0.5

lcs : List.List a -> List.List a -> List.List a

Longest Common Subsequence

This is a simple implementation and would benefit from memoization if performance is a problem. It does not limit look ahead which can be very costly see lcsLimit for a limited look ahead version.

Warning this gets very slow very quickly with increases in list lengths even 17 character lists can cause things to bog down.

This implementation is based on [[http://rosettacode.org/wiki/Longest_common_subsequence#Haskell](http://rosettacode.org/wiki/Longest_common_subsequence#Haskell)](http://rosettacode.org/wiki/Longest_common_subsequence#Haskell)

    lcs ["a", "b", "c"] ["b", "c", "d"] == ["b", "c"]

lcsLimit : Basics.Int
    -> List.List a
    -> List.List a
    -> List.List a

Return function which returns lcs with limited look ahead.

Warning maxLookAhead quickly makes the returned function costly stay below 8 if you want responsiveness.

    lcsLimit 5 ["a", "b", "c"] ["b", "c", "d"] == ["b", "c"]


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment