Skip to content

Instantly share code, notes, and snippets.

@gekh
Last active July 19, 2023 17:07
Show Gist options
  • Save gekh/330266fc9c776a9509ecc1b0186c40f3 to your computer and use it in GitHub Desktop.
Save gekh/330266fc9c776a9509ecc1b0186c40f3 to your computer and use it in GitHub Desktop.
Damerau–Levenshtein distance [PHP]
<?php
/**
* Find Damerau–Levenshtein distance between two string
*
* @param string $source
* @param string $dest
* @return int Damerau–Levenshtein distance
*/
function distance($source, $dest)
{
if ($source == $dest) {
return 0;
}
$slen = strlen($source);
$dlen = strlen($dest);
if ($slen == 0) {
return $dlen;
} else {
if ($dlen == 0) {
return $slen;
}
}
$prevRow = range(0, $dlen);
for ($i = 0; $i < $slen; $i++) {
$thisRow = array($i + 1);
$char = $source[$i];
$prev_char = $source[$i - 1] ?? '';
for ($j = 0; $j < $dlen; $j++) {
$cost = $char == $dest[$j] ? 0 : 1;
$thisRow[$j + 1] = min(
$prevRow[$j + 1] + 1, // deletion
$thisRow[$j] + 1, // insertion
$prevRow[$j] + $cost // substitution
);
if ($i > 0 && $j > 0 && $char == $dest[$j - 1] && $prev_char == $dest[$j]) {
$thisRow[$j + 1] = min(
$thisRow[$j + 1],
$preprevRow[$j - 1] + $cost // transposition
);
}
}
$preprevRow = $prevRow;
$prevRow = $thisRow;
}
return $prevRow[$j];
}
$source = 'Helol';
$dest = 'Hello';
preg_match('#([a-zA-Z])#u', $source, $matches);
if (!empty($matches)) {
echo distance($source, $dest) / 2 . PHP_EOL . "<br>" . PHP_EOL;
echo levenshtein($source, $dest) / 2; // built-in PHP function
} else {
echo distance($source, $dest) . PHP_EOL . "<br>" . PHP_EOL;
echo levenshtein($source, $dest); // built-in PHP function
}
@gekh
Copy link
Author

gekh commented Jun 22, 2020

PHP Notice: Uninitialized string offset: -1 in /home/jeff/dl-distance-gekh.php on line 27
Perhaps line 27 should be $prev_char = $i ? $source{$i-1} : '' ;

Thank you. I've updated this code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment