Skip to content

Instantly share code, notes, and snippets.

@Badgerati
Created August 5, 2012 02:09
  • Star 10 You must be signed in to star a gist
  • Fork 5 You must be signed in to fork a gist
Star You must be signed in to star a gist
Save Badgerati/3261142 to your computer and use it in GitHub Desktop.
An implementation of the Levenshtein distance algorithm in LUA.
-- Returns the Levenshtein distance between the two given strings
function string.levenshtein(str1, str2)
local len1 = string.len(str1)
local len2 = string.len(str2)
local matrix = {}
local cost = 0
-- quick cut-offs to save time
if (len1 == 0) then
return len2
elseif (len2 == 0) then
return len1
elseif (str1 == str2) then
return 0
end
-- initialise the base matrix values
for i = 0, len1, 1 do
matrix[i] = {}
matrix[i][0] = i
end
for j = 0, len2, 1 do
matrix[0][j] = j
end
-- actual Levenshtein algorithm
for i = 1, len1, 1 do
for j = 1, len2, 1 do
if (str1:byte(i) == str2:byte(j)) then
cost = 0
else
cost = 1
end
matrix[i][j] = math.min(matrix[i-1][j] + 1, matrix[i][j-1] + 1, matrix[i-1][j-1] + cost)
end
end
-- return the last value - this is the Levenshtein distance
return matrix[len1][len2]
end
@ThePC007
Copy link

ThePC007 commented Jul 6, 2021

This is great, thanks. :)

@SoapHeadedSoap
Copy link

who came here from that one devforum

@gram-signal
Copy link

quick note: because in line 35 you only ever refer to matrix[i] and matrix[i-1], you can get away with just storing the two most recent lines of the matrix, rather than computing the whole thing. This gets you down to O(2n) space (while still taking O(n^2) time)

@jarble
Copy link

jarble commented Apr 23, 2022

You can use a modified version of this algorithm to search for substrings that closely match another string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment