Hi all,

I am after a function/VBA to compare text. It will be comparing across 2 cells (e.g. A1 and B1) and text responses will rarely be more than one/two words, so the comparison should be based on characters, not words. Also, I need it to output a % of similarity: 100% being a perfect match, 0% being not at all comparable. Another important point, which is what I am yet to figure out with my current working solution is how to make it Not case sensitive, below solution gives me pretty much what I want except this point. e.g. at the moment it is return "Love" vs. "LOVE" as only 25% similar.

Also, I wanted to know if there is a way to make it default to 100% (or weight it to a higher percentage) if all the characters are used from column B and are in the same order i.e. If in column B the entry is "Apple", and in Column A the response is "Apple iPhone".

I am using a standard levenshtein function to do this at the moment, however, I was wondering if there is anything more comprehensive out there! Any help/advice would be appreciated!

Excel formula: =(MAX(LEN(A1), LEN(B1)) - levenshtein(A1,B1))/MAX(LEN(A1), LEN(B1))

Function levenshtein(a As String, b As String) As Integer

Dim i As Integer
Dim j As Integer
Dim cost As Integer
Dim d() As Integer
Dim min1 As Integer
Dim min2 As Integer
Dim min3 As Integer

If Len(a) = 0 Then
levenshtein = Len(b)
Exit Function
End If

If Len(b) = 0 Then
levenshtein = Len(a)
Exit Function
End If

ReDim d(Len(a), Len(b))

For i = 0 To Len(a)
d(i, 0) = i
Next

For j = 0 To Len(b)
d(0, j) = j
Next

For i = 1 To Len(a)
For j = 1 To Len(b)
If Mid(a, i, 1) = Mid(b, j, 1) Then
cost = 0
Else
cost = 1
End If
min1 = (d(i - 1, j) + 1)
min2 = (d(i, j - 1) + 1)
min3 = (d(i - 1, j - 1) + cost)
d(i, j) = Application.WorksheetFunction.Min(min1, min2, min3)
Next
Next

levenshtein = d(Len(a), Len(b))

End Function