Skip to content

Instantly share code, notes, and snippets.

@KartikTalwar
Created March 25, 2012 21:01
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save KartikTalwar/2199726 to your computer and use it in GitHub Desktop.
Save KartikTalwar/2199726 to your computer and use it in GitHub Desktop.
Damerau-Levenshtein distance
def dlevenshtein(seq1, seq2):
"""
This implementation is O(N*M) time and O(M) space, for N and M the
lengths of the two sequences.
# codesnippet:D0DE4716-B6E6-4161-9219-2903BF8F547F
"""
oneago = None
thisrow = range(1, len(seq2) + 1) + [0]
for x in xrange(len(seq1)):
twoago, oneago, thisrow = oneago, thisrow, [0] * len(seq2) + [x + 1]
for y in xrange(len(seq2)):
delcost = oneago[y] + 1
addcost = thisrow[y - 1] + 1
subcost = oneago[y - 1] + (seq1[x] != seq2[y])
thisrow[y] = min(delcost, addcost, subcost)
# This block deals with transpositions
if (x > 0 and y > 0 and seq1[x] == seq2[y - 1]
and seq1[x-1] == seq2[y] and seq1[x] != seq2[y]):
thisrow[y] = min(thisrow[y], twoago[y - 2] + 1)
return thisrow[len(seq2) - 1]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment