Skip to content

Instantly share code, notes, and snippets.

@shibukawa
Created April 19, 2012 15:38
Show Gist options
  • Save shibukawa/2421789 to your computer and use it in GitHub Desktop.
Save shibukawa/2421789 to your computer and use it in GitHub Desktop.
Arranged Damerau–Levenshtein distance
module Distance
module_function
def calc_weight(index, length)
distance = (length - 1.0 - index) / (length - 1.0)
distance * distance * 0.5 + 1.0
end
def strings_distance(source, target, distance_limit)
if source == target
return 0
end
m = source.length
n = target.length
if (m - n).abs > distance_limit
return 100
end
inf = m + n
h = Array.new(m + 2) { |i|
a = Array.new(n + 2, 0)
a[0] = inf
a[1] = i
a
}
h1 = h[1]
h0 = h[0]
(0..n).each do |i|
h1[i + 1] = i
h0[i + 1] = inf
end
sd = {}
(1..m).each do |i|
db = 0
hi = h[i]
hi1 = h[i + 1]
weight = calc_weight(i - 1, m)
c1 = source[i - 1].downcase
(1..n).each do |j|
i1 = sd[target[j - 1]] || 0
j1 = db
c2 = target[j - 1]
if c1 == c2.downcase
hi1[j + 1] = hi[j]
db = j
else
hi1[j + 1] = [hi[j], hi1[j], hi[j + 1]].min + 1 * weight
end
hi1[j + 1] = [hi1[j + 1], h[i1][j1] + ((i - i1 - 1) + 1 + (j - j1 - 1)) * weight].min
end
sd[source[i - 1]] = i
end
h[m + 1][n + 1]
end
end
def test(source, target)
p source + " <--> " + target + " = " + Distance.strings_distance(source, target, 3).to_s
end
test("hello", "hellow")
test("hello", "helol")
test("hello", "helo")
test("hello", "helo")
test("hello", "Hello")
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment