+ Reply to Thread
Results 1 to 6 of 6

Search for best match with strings

Hybrid View

  1. #1
    Forum Contributor
    Join Date
    06-05-2013
    Location
    Austria
    MS-Off Ver
    Excel 2016
    Posts
    169

    Search for best match with strings

    Hy everyone,

    I was looking for AddIns and macros to find a way to search and find the best match with strings. But no luck. Can you please help me?
    In column A are different strings. Take the first (A2) and search for the best result that matches in Column E. Then copy the best result after searching to B2
    and show the percentage of matching. Then take A3 and do the same again.
    I guess this is a big thing to do, but thanks for your help.

    Kind Regards
    Roman
    Attached Files Attached Files

  2. #2
    Administrator 6StringJazzer's Avatar
    Join Date
    01-27-2010
    Location
    Tysons Corner, VA, USA
    MS-Off Ver
    MS 365 Family 64-bit 2502
    Posts
    26,928

    Re: Search for best match with strings

    The hard part here is defining what "best match" means. Do you have a definition in mind?

    One popular method is the Levenshtein algorithm. Here is a list of threads in this forum that address this type of matching.
    Jeff
    | | |會 |會 |會 |會 | |:| | |會 |會
    Read the rules
    Use code tags to [code]enclose your code![/code]

  3. #3
    Forum Contributor
    Join Date
    06-05-2013
    Location
    Austria
    MS-Off Ver
    Excel 2016
    Posts
    169

    Re: Search for best match with strings

    Hi 6StringJazzer (great name, I love Wes Montgomery),
    I have strings and numbers in different cells. I always had to search where these strings and numbers are in my original datas. It would be great if there is something, that checks the strings and numbers I get and if there are cells in which all of these informations are in my database. Of course not always 100%. I searched a lot of different ways to match this but I had no luck...
    Thanks
    Kind Regards

  4. #4
    Forum Expert
    Join Date
    06-12-2012
    Location
    Ridgefield Park, New Jersey
    MS-Off Ver
    Excel 2003,2007,2010
    Posts
    10,241

    Re: Search for best match with strings

    FWIW this gives the results in your sample.

    Sub moosmahna()
    Dim i As Long, x As Long, y As Long, xx As String, yy As String, z As Range, zz As Range, zzz As Range
    For i = 2 To 5
        xx = Left(Cells(i, "A"), 4)
        yy = Right(Cells(i, "A"), 4)
        xxx = mid(Cells(i, "A"), 4, 4)
        Set z = Columns(5).Find(xx, LookIn:=xlValues, lookat:=xlPart)
            If Not z Is Nothing Then
                Cells(i, "B") = z.Value
            Else
               Set zz = Columns(5).Find(yy, LookIn:=xlValues, lookat:=xlPart)
                If Not zz Is Nothing Then
                    Cells(i, "B") = zz.Value
                Else
                    Set zzz = Columns(5).Find(xxx, LookIn:=xlValues, lookat:=xlPart)
                    If Not zzz Is Nothing Then
                        Cells(i, "B") = zzz.Value
                    End If
                    Set zzz = Nothing
                End If
                Set zz = Nothing
            End If
        Set z = Nothing
    Next i
    End Sub

  5. #5
    Forum Contributor
    Join Date
    06-05-2013
    Location
    Austria
    MS-Off Ver
    Excel 2016
    Posts
    169

    Re: Search for best match with strings

    Hi,
    thanks for the macro. I have tested it. It looks really good. I will test it with my original data. This will take some time but I let you know if it works.
    I saw some functions with Levenshtein
    '
     Levenshtein3 tweaked for UTLIMATE speed and CORRECT results
    ' Solution based on Longs
    ' Intermediate arrays holding Asc()make difference
    ' even Fixed length Arrays have impact on speed (small indeed)
    ' Levenshtein version 3 will return correct percentage
    '
    Function Levenshtein3(ByVal string1 As String, ByVal string2 As String) As Long
    
    Dim i As Long, j As Long, string1_length As Long, string2_length As Long
    Dim distance(0 To 60, 0 To 50) As Long, smStr1(1 To 60) As Long, smStr2(1 To 50) As Long
    Dim min1 As Long, min2 As Long, min3 As Long, minmin As Long, MaxL As Long
    
    string1_length = Len(string1):  string2_length = Len(string2)
    
    distance(0, 0) = 0
    For i = 1 To string1_length:    distance(i, 0) = i: smStr1(i) = Asc(LCase(Mid$(string1, i, 1))): Next
    For j = 1 To string2_length:    distance(0, j) = j: smStr2(j) = Asc(LCase(Mid$(string2, j, 1))): Next
    For i = 1 To string1_length
        For j = 1 To string2_length
            If smStr1(i) = smStr2(j) Then
                distance(i, j) = distance(i - 1, j - 1)
            Else
                min1 = distance(i - 1, j) + 1
                min2 = distance(i, j - 1) + 1
                min3 = distance(i - 1, j - 1) + 1
                If min2 < min1 Then
                    If min2 < min3 Then minmin = min2 Else minmin = min3
                Else
                    If min1 < min3 Then minmin = min1 Else minmin = min3
                End If
                distance(i, j) = minmin
            End If
        Next
    Next
    
    ' Levenshtein3 will properly return a percent match (100%=exact) based on similarities and Lengths etc...
    MaxL = string1_length: If string2_length > MaxL Then MaxL = string2_length
    Levenshtein3 = 100 - CLng((distance(string1_length, string2_length) * 100) / MaxL)
    
    End Function
    How can I use this in my workbook? This looks also interessting.
    Kind Regards
    Roman

  6. #6
    Forum Contributor
    Join Date
    06-05-2013
    Location
    Austria
    MS-Off Ver
    Excel 2016
    Posts
    169

    Re: Search for best match with strings

    Option Explicit
    
      Public Declare Function GetTickCount Lib "kernel32" () As Long
    
      Sub test()
      Dim s1 As String, s2 As String, lTime As Long, i As Long
      s1 = Space(100)
      s2 = String(100, "a")
      lTime = GetTickCount
      For i = 1 To 100
         LevenshteinStrings s1, s2  ' the original fn from Wikibooks and Stackoverflow
      Next
      Debug.Print GetTickCount - lTime; " ms" '  3900  ms for all diff
    
      lTime = GetTickCount
      For i = 1 To 100
         Levenshtein s1, s2
      Next
      Debug.Print GetTickCount - lTime; " ms" ' 234  ms
    
      End Sub
    
      'Option Base 0 assumed
    
      'POB: fn with byte array is 17 times faster
      Function Levenshtein(ByVal string1 As String, ByVal string2 As String) As Long
    
      Dim i As Long, j As Long, bs1() As Byte, bs2() As Byte
      Dim string1_length As Long
      Dim string2_length As Long
      Dim distance() As Long
      Dim min1 As Long, min2 As Long, min3 As Long
    
      string1_length = Len(string1)
      string2_length = Len(string2)
      ReDim distance(string1_length, string2_length)
      bs1 = string1
      bs2 = string2
    
      For i = 0 To string1_length
          distance(i, 0) = i
      Next
    
      For j = 0 To string2_length
          distance(0, j) = j
      Next
    
      For i = 1 To string1_length
          For j = 1 To string2_length
              'slow way: If Mid$(string1, i, 1) = Mid$(string2, j, 1) Then
              If bs1((i - 1) * 2) = bs2((j - 1) * 2) Then   ' *2 because Unicode every 2nd byte is 0
                  distance(i, j) = distance(i - 1, j - 1)
              Else
                  'distance(i, j) = Application.WorksheetFunction.Min _
                  (distance(i - 1, j) + 1, _
                   distance(i, j - 1) + 1, _
                   distance(i - 1, j - 1) + 1)
                  ' spell it out, 50 times faster than worksheetfunction.min
                  min1 = distance(i - 1, j) + 1
                  min2 = distance(i, j - 1) + 1
                  min3 = distance(i - 1, j - 1) + 1
                  If min1 <= min2 And min1 <= min3 Then
                      distance(i, j) = min1
                  ElseIf min2 <= min1 And min2 <= min3 Then
                      distance(i, j) = min2
                  Else
                      distance(i, j) = min3
                  End If
    
              End If
          Next
      Next
    
      Levenshtein = distance(string1_length, string2_length)
    
      End Function

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Match different long strings with an array of short, summarizing strings
    By FKemps in forum Excel Formulas & Functions
    Replies: 2
    Last Post: 09-24-2015, 12:55 AM
  2. [SOLVED] search for parts of strings in another list of strings
    By marioroter in forum Excel Formulas & Functions
    Replies: 9
    Last Post: 12-17-2013, 12:28 PM
  3. excel formula to search Multiple strings in several columns and return strings
    By krratna123 in forum Excel Formulas & Functions
    Replies: 4
    Last Post: 08-13-2013, 11:20 AM
  4. Replies: 1
    Last Post: 08-13-2013, 08:32 AM
  5. Truncate text strings using LEFT and FIND/SEARCH for multiple search terms
    By ngdoherty in forum Excel Formulas & Functions
    Replies: 2
    Last Post: 04-17-2013, 07:51 PM
  6. [SOLVED] loop to match if shorter strings appears in longer strings
    By mcdermott2 in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 06-19-2012, 08:07 PM
  7. Trying to create a macro to match up search strings and variations in two columns....
    By beepbeep27 in forum Excel Programming / VBA / Macros
    Replies: 9
    Last Post: 05-02-2012, 05:43 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1