Levenshtein Distance
Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other. It is named after Vladimir Levenshtein, who considered this distance in 1965.
function MIN3($a, $b, $c)
{
return (($a) < ($b) ? (($a) < ($c) ? ($a) : ($c)) : (($b) < ($c) ? ($b) : ($c)));
}
function LevenshteinDistance($s1, $s2)
{
$x; $y; $lastdiag; $olddiag;
$s1len = strlen($s1);
$s2len = strlen($s2);
$column = array();
for ($y = 1; $y <= $s1len; ++$y)
$column[$y] = $y;
for ($x = 1; $x <= $s2len; ++$x)
{
$column[0] = $x;
for ($y = 1, $lastdiag = $x - 1; $y <= $s1len; ++$y)
{
$olddiag = $column[$y];
$column[$y] = MIN3($column[$y] + 1, $column[$y - 1] + 1, ($lastdiag + ($s1[$y - 1] == $s2[$x - 1] ? 0 : 1)));
$lastdiag = $olddiag;
}
}
return $column[$s1len];
}
Example
$result = LevenshteinDistance("kitten", "sitting");
Output
result: 3