Levenshtein Distance
The algorithm that measures how different two strings are
What is Levenshtein Distance?
The Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, it's the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into another.
Example:
Distance between "kitten" and "sitting" is 3:
- kitten → sitten (substitute 'k' with 's')
- sitten → sittin (substitute 'e' with 'i')
- sittin → sitting (insert 'g' at end)
How the Algorithm Works
The Levenshtein distance is computed using dynamic programming. We build a matrix where each cell represents the distance between substrings.
Step-by-Step Process
- 1 Create a matrix with (m+1) rows and (n+1) columns where m,n are lengths of the strings
- 2 Initialize first row with 0 to m and first column with 0 to n
-
3
For each character comparison, fill the matrix cell with the minimum of:
- Top cell + 1 (deletion)
- Left cell + 1 (insertion)
- Diagonal cell + cost (substitution, where cost is 0 if same, 1 if different)
- 4 The bottom-right cell contains the final distance
Interactive Matrix
Applications
Spell Checking
Suggest corrections by finding dictionary words with smallest distance to misspelled words
DNA Analysis
Measure similarity between genetic sequences to study mutations and evolution
Fuzzy Search
Find approximate matches in databases when exact matching isn't possible
Implementation in JavaScript
function levenshteinDistance(a, b) {
const matrix = [];
// Initialize matrix
for (let i = 0; i <= b.length; i++) {
matrix[i] = [i];
}
for (let j = 0; j <= a.length; j++) {
matrix[0][j] = j;
}
// Fill the matrix
for (let i = 1; i <= b.length; i++) {
for (let j = 1; j <= a.length; j++) {
const cost = a[j-1] === b[i-1] ? 0 : 1;
matrix[i][j] = Math.min(
matrix[i-1][j] + 1, // deletion
matrix[i][j-1] + 1, // insertion
matrix[i-1][j-1] + cost // substitution
);
}
}
return matrix[b.length][a.length];
}