+ Reply to Thread
Results 1 to 8 of 8

Compare words between two cells for percentage match

Hybrid View

  1. #1
    Registered User
    Join Date
    09-13-2010
    Location
    Australia
    MS-Off Ver
    Excel 2003
    Posts
    2

    Compare words between two cells for percentage match

    Hi folks,

    I have a question that is similar but different to an earlier post (April 26 to be precise - same title). I am a relative beginner with VB so it would be fantastic if someone could lend me a hand.

    I have two columns of text each with 1,000+ lines. I want a formula that generates a percentage match for the words in cells of the same row (eg cells B1 and C1).

    I have a formula (courtesy of the earlier post) that counts the number of differing characters in the two cells (see below for VBA code). What I really want is a formula that counts the number of differing words.

    Example:
    Cell A1: the cat sat on the mat
    Cell B1: the hat was squashed by the cat

    The Levenshtein formula (code and link below) gives me an output of 14 ie, there are 14 different characters in cell B1 than in cell A1. The result I am looking for is that there are 4 words used in cell B1 that are not used in cell A1.

    If there is a way to measure the change in the order of words I would be interested in that also, but I realise that is probably very complicated.

    Any help is much appreciated!

    The formula I have was posted on this forum by SpeedingLunatic in April 2010 and is a Levenshtein measure. The VBA was borrowed from here: http://en.wikibooks.org/wiki/Algorit...htein_distance and is below...

    Function levenshtein(a As String, b As String) As Integer
     
        Dim i As Integer
        Dim j As Integer
        Dim cost As Integer
        Dim d() As Integer
        Dim min1 As Integer
        Dim min2 As Integer
        Dim min3 As Integer
     
        If Len(a) = 0 Then
            levenshtein = Len(b)
            Exit Function
        End If
     
        If Len(b) = 0 Then
            levenshtein = Len(a)
            Exit Function
        End If
     
        ReDim d(Len(a), Len(b))
     
        For i = 0 To Len(a)
            d(i, 0) = i
        Next
     
        For j = 0 To Len(b)
            d(0, j) = j
        Next
     
        For i = 1 To Len(a)
            For j = 1 To Len(b)
                If Mid(a, i, 1) = Mid(b, j, 1) Then
                    cost = 0
                Else
                    cost = 1
                End If
                min1 = (d(i - 1, j) + 1)
                min2 = (d(i, j - 1) + 1)
                min3 = (d(i - 1, j - 1) + cost)
                d(i, j) = Application.WorksheetFunction.Min(min1, min2, min3)
            Next
        Next
     
        levenshtein = d(Len(a), Len(b))
     
    End Function
    Last edited by pike; 09-13-2010 at 02:27 AM. Reason: code tags for newbie MP message

  2. #2
    Forum Expert
    Join Date
    11-23-2005
    Location
    Rome
    MS-Off Ver
    Ms Office 2016
    Posts
    1,628

    Re: Compare words between two cells for percentage match

    You could use this function:
    Function levenshtein(a As String, b As String) As Integer
       Dim myWord As Variant
       
       levenshtein = 0
       For Each myWord In Split(b, " ")
          If InStr(" " & a & " ", " " & myWord & " ") = 0 Then
             levenshtein = levenshtein + 1
          End If
       Next myWord
    End Function
    Regards,
    Antonio
    Last edited by antoka05; 09-13-2010 at 02:47 AM.

  3. #3
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Compare words between two cells for percentage match

    Below is similar but has case-sensitivity.

    Function MISSINGWORDS(rngS1 As Range, rngS2 As Range, Optional boolCase = False) As Long
        Dim vS1 As Variant
        Dim lngW As Long, lngTemp As Long
        Dim vbCompare As VbCompareMethod
        vbCompare = IIf(boolCase, vbBinaryCompare, vbTextCompare)
        vS1 = Split(rngS1)
        For lngW = LBound(vS1) To UBound(vS1)
            lngTemp = lngTemp - (InStr(1, " " & rngS2.Value & " ", " " & vS1(lngW) & " ", vbCompare) = 0)
        Next lngW
        MISSINGWORDS = lngTemp
    End Function
    Regards change in order - you might need to specify some examples so we can better interpret requirements.

    Note: regards both UDFs presented - both assume a consistent delimiter between words (ie no punctuation per se) - if this is not reality you will need to clean the strings before comparing.

  4. #4
    Forum Expert snb's Avatar
    Join Date
    05-09-2010
    Location
    VBA
    MS-Off Ver
    Redhat
    Posts
    5,649

    Re: Compare words between two cells for percentage match

    Sub snb_paradigm()
      For Each wd In Split(Cells(1, 1))
        x1 = x1 + UBound(Filter(Split(Cells(1,2), wd)) + 1
      Next
    End Sub
    if you also want to take the words' order into account :
    Sub snb_paradigm_extended()
      For Each wd In Split(Cells(1, 1))
         x1 = x1 + UBound(Filter(Split(Cells(1, 2)), wd)) + 1
      Next
      sq = Split(Cells(1, 1))
      For jj = 1 To UBound(sq)
        x2 = 0
        For j = 0 To UBound(sq) - jj
          c01 = ""
          For jjj = 0 To jj
            c01 = IIf(c01 = "", "", c01 & " ") & sq(j + jjj)
          Next
          x2 = x2 + UBound(Split(Cells(1, 2), c01))
        Next
        If x2 = 0 Then Exit For
        x3 = x3 + x2
      Next
    End Sub
    In this last case you can choose to weigh the different matches: a return of 4 consecutive words can be multiplied by 4, a match of 3 consecutive words can be multiplied by 3.
    It's also possible to count the different kind of matches in separate variables or array-elements.
    Last edited by snb; 09-13-2010 at 04:10 AM.



  5. #5
    Registered User
    Join Date
    09-22-2010
    Location
    london,england
    MS-Off Ver
    Excel 2003
    Posts
    2

    Re: Compare words between two cells for percentage match

    how do i use the vb script that snb posted above ? how do i insert it and where

  6. #6
    Forum Expert snb's Avatar
    Join Date
    05-09-2010
    Location
    VBA
    MS-Off Ver
    Redhat
    Posts
    5,649

    Re: Compare words between two cells for percentage match

    You can put the code in the sheet-module in which you want to count the words.
    If you put them in another module (macro/workbook/userform module) you'll have to add the sheetname the cells are in.
    NB. the preceding period: .Cells(...
    Sub snb_paradigm()
      With sheets(1)
        For Each wd In Split(.cells(1,1))
          x1 = x1 + UBound(Filter(Split(.Cells(1,2), wd)) + 1
        Next
      End With
    End Sub
    Sub snb_paradigm_extended()
      With sheets(1)
        For Each wd In Split(.Cells(1, 1))
           x1 = x1 + UBound(Filter(Split(.Cells(1, 2)), wd)) + 1
        Next
        sq = Split(.Cells(1, 1))
        For jj = 1 To UBound(sq)
          x2 = 0
          For j = 0 To UBound(sq) - jj
            c01 = ""
            For jjj = 0 To jj
              c01 = IIf(c01 = "", "", c01 & " ") & sq(j + jjj)
            Next
            x2 = x2 + UBound(Split(.Cells(1, 2), c01))
          Next
          If x2 = 0 Then Exit For
          x3 = x3 + x2
        Next
      End With
    End Sub

  7. #7
    Registered User
    Join Date
    09-13-2010
    Location
    Australia
    MS-Off Ver
    Excel 2003
    Posts
    2

    Re: Compare words between two cells for percentage match

    [SOLVED} Thanks very much to the three of you for you suggestions, and apologies that it has taken me a couple of days to get back to the forum.

    The code you supplied is perfect for my needs.

    Cheers!

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1