I am trying to develop a program whose end goal is to find all instances of similar text within cells across multiple sheets and group them together. This question deals specifically with the sub-process that handles the comparison. I wrote the following code as a simple test to see how my comparison could work. It compares two adjacent cells letter by letter and outputs a similarity score. The code is below:
Option Explicit

Sub ComparisonTest()
    Dim arr As Variant
    Dim columncount As Integer
    Dim rowcount As Long

    Sheets(1).Select
    arr = ActiveSheet.Range("A1").CurrentRegion
    columncount = UBound(arr, 2)
    rowcount = UBound(arr, 1)

    Dim N As Long
    Dim Letter As Long
    Dim test1_string As Variant
    Dim test2_string As Variant
    Dim Test1_Comp As String
    Dim Test2_Comp As String
    Dim similarity As Integer
    Dim lettercount As Integer
    Dim lettercount2 As Integer

    'loop over the rows from top to bottom, comparing column A with column B
    For N = 1 To rowcount
        'use the length of the longer string as the denominator
        lettercount = Len(Cells(N, 1).Value)
        lettercount2 = Len(Cells(N, 2).Value)
        If lettercount2 > lettercount Then
            lettercount = lettercount2
        End If

        test1_string = CStr(Cells(N, 1))
        test2_string = CStr(Cells(N, 2))

        'count the positions where both strings have the same character
        For Letter = 1 To Len(Cells(N, 1).Value)
            Test1_Comp = Mid(test1_string, Letter, 1)
            Test2_Comp = Mid(test2_string, Letter, 1)
            If Test1_Comp = Test2_Comp Then
                similarity = similarity + 1
            End If
        Next Letter

        'write the score to column C: green above 90% similarity, red otherwise
        If similarity > (lettercount * 0.9) Then
            Cells(N, 3).Value = (similarity / lettercount)
            Cells(N, 3).Interior.Color = vbGreen
        Else
            Cells(N, 3).Value = (similarity / lettercount)
            Cells(N, 3).Interior.Color = vbRed
        End If

        'reset the count for the next row
        similarity = 0
    Next N
End Sub
The problem with this code is that the results are not what I want. For example, if I compare "This is a test string" to "Ths is a text string", the similarity score comes out very low. Because my macro compares value by value at each position, a mistake near the beginning shifts every later character out of alignment, when in actuality the only differences are a missing "i" and one changed letter ("test" vs. "text"). Does anyone have suggestions for changing this code to compare word by word instead of value by value? I use the word "value" here because there will be instances where the "text" includes numbers, such as 90%.
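To make the word-by-word idea concrete, here is a minimal sketch of what such a comparison might look like. The function name WordSimilarity is hypothetical, and it assumes that words are separated by single spaces and that a "word" is any run of characters between spaces (so "90%" counts as one word). It only compares words at the same position, so it is an illustration of the idea rather than a complete solution.

```vba
'Hypothetical helper: returns the fraction of words that match at the same position.
'Assumes words are separated by single spaces; repeated spaces yield empty "words".
Function WordSimilarity(ByVal text1 As String, ByVal text2 As String) As Double
    Dim words1() As String
    Dim words2() As String
    Dim count1 As Long
    Dim count2 As Long
    Dim longest As Long
    Dim shortest As Long
    Dim i As Long
    Dim matches As Long

    'Split returns a zero-based array; an empty string gives UBound = -1
    words1 = Split(Trim$(text1), " ")
    words2 = Split(Trim$(text2), " ")
    count1 = UBound(words1) + 1
    count2 = UBound(words2) + 1

    If count1 > count2 Then
        longest = count1
        shortest = count2
    Else
        longest = count2
        shortest = count1
    End If

    'avoid dividing by zero when both inputs are empty
    If longest = 0 Then
        WordSimilarity = 0
        Exit Function
    End If

    'case-sensitive comparison, like the "=" test in the original macro
    For i = 0 To shortest - 1
        If StrComp(words1(i), words2(i), vbBinaryCompare) = 0 Then
            matches = matches + 1
        End If
    Next i

    'divide by the longer word count so extra words lower the score
    WordSimilarity = matches / longest
End Function
```

Inside the existing loop it could be called as Cells(N, 3).Value = WordSimilarity(CStr(Cells(N, 1).Value), CStr(Cells(N, 2).Value)). For the two example strings above, "is", "a" and "string" match while "This"/"Ths" and "test"/"text" do not, so the score is 3 out of 5, or 0.6. A missing or extra word would still shift every later word out of position, so this only moves the alignment problem from letters to words.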