@suin
Last active February 10, 2016 04:59
mb_levenshtein: computes the Levenshtein distance between two strings (multibyte-aware version). ref: http://qiita.com/suin/items/a0a8227addad11ff2ea7
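<?php
/**
 * Multibyte-aware Levenshtein distance.
 *
 * Maps every distinct character to a single ASCII letter and then delegates
 * to the byte-based levenshtein(). Returns false when the two strings
 * together contain more than 26 distinct characters.
 */
function mb_levenshtein($string1, $string2)
{
    // Split both strings into UTF-8 characters (an empty string yields no tokens).
    $tokens1 = preg_split('//u', $string1, -1, PREG_SPLIT_NO_EMPTY);
    $tokens2 = preg_split('//u', $string2, -1, PREG_SPLIT_NO_EMPTY);
    $tokens = array_unique(array_merge($tokens1, $tokens2));
    if ( count($tokens) > 26 )
    {
        return false;
    }
    // Build the whole character => letter map first and apply it in one pass
    // with strtr(). Chained str_replace() calls would substitute letters that
    // an earlier step had already produced whenever the input itself
    // contains a-z.
    $map = array();
    $ascii = 'a';
    foreach ( $tokens as $token )
    {
        $map[$token] = $ascii;
        $ascii ++;
    }
    // Forward any extra arguments (custom costs) to levenshtein() unchanged.
    $arguments = func_get_args();
    $arguments[0] = strtr($string1, $map);
    $arguments[1] = strtr($string2, $map);
    return call_user_func_array('levenshtein', $arguments);
}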
var_dump(mb_levenshtein('あとうかい', 'かとうあい')); // int(2)
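The function above returns false once the two strings contain more than 26 distinct characters between them. Where that limit gets in the way, one possible alternative is to compute the distance directly on arrays of characters with the standard dynamic-programming recurrence. The following is only a minimal sketch, not part of the original gist: the name mb_levenshtein_dp is hypothetical, it assumes PHP 7 or later and valid UTF-8 input, and it supports unit costs only (no custom insertion, replacement, or deletion costs).

```php
<?php
// Hypothetical helper (not from the original gist): character-level
// Levenshtein distance with unit costs, using two rows of the DP table.
function mb_levenshtein_dp(string $string1, string $string2): int
{
    // Split into UTF-8 characters; assumes both inputs are valid UTF-8.
    $a = preg_split('//u', $string1, -1, PREG_SPLIT_NO_EMPTY);
    $b = preg_split('//u', $string2, -1, PREG_SPLIT_NO_EMPTY);
    $n = count($b);

    // $previous[$j] is the distance between the prefix of $a processed so far
    // (initially empty) and the first $j characters of $b.
    $previous = range(0, $n);
    foreach ( $a as $i => $charA )
    {
        $current = array($i + 1);
        foreach ( $b as $j => $charB )
        {
            $cost = ( $charA === $charB ) ? 0 : 1;
            $current[] = min(
                $previous[$j + 1] + 1, // deletion
                $current[$j] + 1,      // insertion
                $previous[$j] + $cost  // substitution
            );
        }
        $previous = $current;
    }
    return $previous[$n];
}

var_dump(mb_levenshtein_dp('あとうかい', 'かとうあい')); // int(2)
```

This trades the speed of the built-in levenshtein() for the absence of a character-count limit; time is proportional to the product of the two character counts, and memory to the length of the second string.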