Skip to content

Instantly share code, notes, and snippets.

@ademar111190
Last active March 5, 2024 10:53
Show Gist options
  • Save ademar111190/34d3de41308389a0d0d8 to your computer and use it in GitHub Desktop.
Save ademar111190/34d3de41308389a0d0d8 to your computer and use it in GitHub Desktop.
Levenshtein Distance algorithm implementation using Kotlin
import kotlin.math.min
fun levenshtein(lhs : CharSequence, rhs : CharSequence) : Int {
if(lhs == rhs) { return 0 }
if(lhs.isEmpty()) { return rhs.length }
if(rhs.isEmpty()) { return lhs.length }
val lhsLength = lhs.length + 1
val rhsLength = rhs.length + 1
var cost = Array(lhsLength) { it }
var newCost = Array(lhsLength) { 0 }
for (i in 1..rhsLength-1) {
newCost[0] = i
for (j in 1..lhsLength-1) {
val match = if(lhs[j - 1] == rhs[i - 1]) 0 else 1
val costReplace = cost[j - 1] + match
val costInsert = cost[j] + 1
val costDelete = newCost[j - 1] + 1
newCost[j] = min(min(costInsert, costDelete), costReplace)
}
val swap = cost
cost = newCost
newCost = swap
}
return cost[lhsLength - 1]
}
@dankovision
Copy link

Nice code, been trying to use it in a homebrew quiz app. Noticed some oddities though, e.g. levenshtein("canada","canad") returns 6 instead of 1. Conversely, levenshtein("canad","canada") returns 0. Haven't wrapped my brain around it enough yet to figure out why...

@ademar111190
Copy link
Author

@dankovision maybe some env issue? I have run here https://pl.kotl.in/b6_FH8CaJ and I got 1 for both cases

@dankovision
Copy link

@dankovision maybe some env issue? I have run here https://pl.kotl.in/b6_FH8CaJ and I got 1 for both cases

Cheers for replying, finally got around to having a look. After much poking around, it was due to me accepting the Android Studio IDE suggestion of replacing the .. operator in the for loops with 'until', e.g.

Original code: for (i in 1..rhsLength - 1)
Suggested replacement: for (i in 1 until rhsLength)

Except I thought I knew better and added the '- 1' back on to the end of rhsLength, meaning the iterations were 1 short (until stops at and doesn't execute the last in the range, '..' stops after the end of the range). Ah well, learnt something today, don't think you know better than the compiler/IDE!

@dankovision
Copy link

And I should add thanks for the code, works a charm now!

@ademar111190
Copy link
Author

@dankovision hahaha nice, good to know :)

@Wavesonics
Copy link

If anyone is interested in some quick and dirty tests so you don't lower your test coverage when you copy paste this sum-bich in:

class LevenshteinDistanceTest {
	@Test
	fun `Distance Test`() {
		testDistance("", "", 0)
		testDistance("1", "1", 0)
		testDistance("1", "2", 1)
		testDistance("12", "12", 0)
		testDistance("123", "12", 1)
		testDistance("1234", "1", 3)
		testDistance("1234", "1233", 1)
		testDistance("", "12345", 5)
		testDistance("kitten", "mittens", 2)
		testDistance("canada", "canad", 1)
		testDistance("canad", "canada", 1)
	}

	private fun testDistance(a: String, b: String, expectedDistance: Int) {
		val d = levenshteinDistance(a, b)
		assertEquals(expectedDistance, d, "Distance did not match for `$a` and `$b`")
	}
}

@Den-Rimus
Copy link

Den-Rimus commented Feb 16, 2024

I used the code for Levenshtein distance method from above but also asked ChatGPT to write me one. Now I'm thinking which one should I pick 🤔 😅

fun levenshteinDistanceByGpt(s1: String, s2: String): Int {
    val m = s1.length
    val n = s2.length

    // Create a 2D array to store the distances
    val dp = Array(m + 1) { IntArray(n + 1) }

    // Initialize the first row and column
    for (i in 0..m) {
        dp[i][0] = i
    }
    for (j in 0..n) {
        dp[0][j] = j
    }

    // Fill in the rest of the array
    for (i in 1..m) {
        for (j in 1..n) {
            val cost = if (s1[i - 1] == s2[j - 1]) 0 else 1
            dp[i][j] = minOf(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost)
        }
    }

    // Return the Levenshtein distance
    return dp[m][n]
}

The human-written one is (more) readable.

And both pass the test from @Wavesonics (thanks for that one! 👍 ).

@glureau
Copy link

glureau commented Feb 19, 2024

Use the GPT one so then you can include "AI" in your app name and win $$$

Seriously you can benchmark them if you want 😊 I presume the human one is better for RAM as it's only 2 arrays instead of a 2d arrays, but yours should be better for cpu I guess?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment