Skip to content

Instantly share code, notes, and snippets.

@amraks
Created June 19, 2017 20:48
Show Gist options
  • Save amraks/769e6be3806b27e0ff6d590f23e9e30f to your computer and use it in GitHub Desktop.
Save amraks/769e6be3806b27e0ff6d590f23e9e30f to your computer and use it in GitHub Desktop.
edit distance
MAX = 999999999
def edit_distance(s1, s2):
d = []
for i in range(len(s1) + 1):
d.append([i])
# here d --> [[0], [1], [2], [3]]
for j in range(1, len(s2) + 1):
d[0].append(j)
# here d --> [[0, 1, 2, 3], [1], [2], [3]]
# Lets call this point INIT (shown below in Explanation)
for i in range(1, len(s1) + 1):
for j in range(1, len(s2) + 1):
d[i].append(MAX)
print d
for i in range(1, len(s1) + 1):
for j in range(1, len(s2) + 1):
tmp = d[i-1][j-1] if s1[i-1] == s2[j-1] else d[i-1][j-1] + 1 # case 3
d[i][j] = min(min(d[i-1][j] + 1, d[i][j-1] + 1), tmp)
return d[len(s1)][len(s2)]
s1 = 'xap'
s2 = 'kapb'
print '%s -> %s, distance=%s' % (s1, s2, edit_distance(s1, s2))
"""
Explanation:
Consider eg. kts to ars.
INIT
s1-- 0 a r s
s2
|
0 0 1 2 3
k 1
t 2
s 3
Think in terms of last operation.
Case 1:
Suppose you converted kt to ars. Now you need to delete s from kts.
So, C1 = E(kt, ars) + 1 (delete s from kts)
Case 2:
Suppose you converted kts to ar. Now you have to add s to ar.
So, C2 = E(kts, ar) + 1 (add s to ar)
Case 3:
Suppose you converted kt to ar. Now you have two choices,
if last chars are equal (as in this case):
C3 = E(kt, ar) + 0 (no cost)
else:
C3 = E(kt, ar) + 1 (convert last char from first string to whatever last char of second string)
E(kts, ars) = min(C1, C2, C3)
C1 = E(i - 1, j) + 1
C2 = E(i, j - 1) + 1
C3 = E(i - 1, j - 1) + (0 if s1[i] == s2[j] else 1)
PS: You have to preserve destination string. So you will always perform operation on SOURCE string,
1. add char to SOURCE string
2. delete char from SOURCE string
3. substitue char from SOURCE string
"""
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment