+ Reply to Thread
Results 1 to 8 of 8

Return text with regex find function

Hybrid View

  1. #1
    Registered User
    Join Date
    11-11-2008
    Location
    Syracuse NY
    MS-Off Ver
    2007
    Posts
    90

    Return text with regex find function

    In VBA , I'm trying to return the result of a regex find to my own sub with the following function. The problem is, I can't figure out what it's returning, or how to get it to return the actual text of the match. I also need to test if there is or isn't a match.

    The function, from this page ( http://www.tmehta.com/regexp/add_code.htm ) is reproduced here for convenience:

    Option Explicit
    #Const LateBind = True
    
    Function RegExpFind(FindIn, FindWhat As String, _
            Optional IgnoreCase As Boolean = False)
        Dim i As Long
        #If Not LateBind Then
        Dim RE As RegExp, allMatches As MatchCollection, aMatch As Match
        Set RE = New RegExp
        #Else
        Dim RE As Object, allMatches As Object, aMatch As Object
        Set RE = CreateObject("vbscript.regexp")
            #End If
        RE.Pattern = FindWhat
        RE.IgnoreCase = IgnoreCase
        RE.Global = True
        Set allMatches = RE.Execute(FindIn)
        ReDim rslt(0 To allMatches.Count - 1)
        For i = 0 To allMatches.Count - 1
            rslt(i) = allMatches(i).Value
            Next i
        RegExpFind = rslt
        End Function
    This function seems to allow for the possibility of finding more than one match at a time, which is not what I want.

    Here is an example of how I'm trying to use it. TwoWaySplitNum is a string variable.

    TwoWaySplitNum = RegExpFind(SplitTxt, "\d\d/\d\d")
    There's also a RegExpSubstitute function on that page if you're interested.
    Last edited by jrussell; 05-07-2010 at 08:29 AM.

  2. #2
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,259

    Re: Return text with regex find function

    Hello jrussell,

    Is this a question or a comment? You can replace substrings with RegExp. You need to provide the data you want parsed. RegExp are rather complex and the more data you provide the better.
    Sincerely,
    Leith Ross

    Remember To Do the Following....

    1. Use code tags. Place [CODE] before the first line of code and [/CODE] after the last line of code.
    2. Thank those who have helped you by clicking the Star below the post.
    3. Please mark your post [SOLVED] if it has been answered satisfactorily.


    Old Scottish Proverb...
    Luathaid gu deanamh maille! (Rushing causes delays!)

  3. #3
    Registered User
    Join Date
    11-11-2008
    Location
    Syracuse NY
    MS-Off Ver
    2007
    Posts
    90

    Re: Return text with regex find function

    Here are some examples of data I need to parse. It's percentage splits among salespeople that I need to find. See the attached workbook for more in-depth examples. If the 'desired result' columns error out on your computer, see the 'desired result text' columns.

    April Totowa/Adam Zeller 50/50
    75/25 Robin Heller/ Matthew Krason
    Mike Viera/ Mike Stein 50/50 eff: 08/01/09 E&M
    Abe Smith/John Franks/Lisa Lowe 60/20/20

    It works fine if I just enter the formulas on a worksheet. I'm just having trouble using the RegExpFind function in a macro.

    I have the logic worked out to first look for a three-way split, and if that fails, to look for a two-way split, and I can work out how to match the patterns with regex. I don't need help with that.

    I need help figuring out in VBA how to return actual matching text if there is any, and how to handle it if there isn't any.

    Here is the half-written macro where I'm trying to use this. If you copy this and the RegExpFind into a module and run it on the attached file "real sheet for upload" you'll be able to step through and follow along in the locals window. Hope this helps.


    Option Explicit
    
    Sub Splits()
    'this macro requires the RegExpFind function to work
    
    Dim LastRow As Long
    Dim i As Long
    Dim j As Long
    Dim SrcSht As Worksheet
    Dim SplitRng As Range
    Dim SplitCol As Long
    Dim SplitTxt As String
    Dim Name1 As String
    Dim Name2Temp As String
    Dim Name2 As String
    Dim Name3 As String
    Dim Pct1Char As String
    Dim Pct2Char As String
    Dim Pct3Char As String
    Dim Pct1 As Long
    Dim Pct2Temp As String
    Dim Pct2 As Long
    Dim Pct3 As Long
    Dim TwoWaySplitNum As String
    Dim TwoWaySplitPeople As String
    Dim ThreeWaySplitNum As String
    Dim ThreeWaySplitPeople As String
    
    Set SrcSht = ActiveSheet
    LastRow = SrcSht.UsedRange.Rows.Count
    
    'find "Split Info" column
    Set SplitRng = Rows("1:1").Find(What:="Split Info", After:=Range("A1"), LookIn:=xlValues, LookAt:= _
            xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False _
            , SearchFormat:=False)
    SplitCol = SplitRng.Column
    
    'for each row where "Split Info" column is not empty
    For i = 2 To LastRow
        If Not IsEmpty(Cells(i, SplitCol)) Then
            SplitTxt = Cells(i, SplitCol).text
            
            'find three way split [standard format] first (if the numbers dont add to 100 they are a date, ignore)
    TryAgainWithoutDate:
            On Error Resume Next
            ThreeWaySplitNum = RegExpFind(SplitTxt, "\d\d/\d\d/\d\d")
            On Error GoTo 0
            If ThreeWaySplitNum <> "" Then
                Pct1Char = RegExpFind(ThreeWaySplitNum, "^\d\d")
                Pct1 = Val(Pct1Char)
                Pct2Temp = RegExpFind(ThreeWaySplitNum, "/\d\d/")
                Pct2Char = Replace(Pct2Temp, "/", "")
                Pct2 = Val(Pct2Char)
                Pct3Char = RegExpFind(ThreeWaySplitNum, "\d\d$")
                Pct3 = Val(Pct3Char)
                'If Pct1 + Pct2 + Pct3 <> 100 Then is date - check rest of SplitText for 2 or 3 way split
                'then SplitText = SplitText without date - replace date with some other chars, or none
                If Pct1 + Pct2 + Pct3 <> 100 Then
                    SplitTxt = Replace(SplitTxt, ThreeWaySplitNum, "")
                    GoTo TryAgainWithoutDate
                End If
                
                ThreeWaySplitPeople = RegExpFind(SplitTxt, "[a-zA-Z ]+/[a-zA-Z ]+/[a-zA-Z ]+")
                Name1 = RegExpFind(ThreeWaySplitPeople, "^[a-zA-Z ]+")
                Name2Temp = RegExpFind(ThreeWaySplitPeople, "/[a-zA-Z ]+/")
                Name2 = Replace(Name2Temp, "/", "")
                Name3 = RegExpFind(ThreeWaySplitPeople, "[a-zA-Z ]+$")
            End If
            
            'find two way split [standard format]
            On Error Resume Next
            TwoWaySplitNum = RegExpFind(SplitTxt, "\d\d/\d\d")
            On Error GoTo 0
            If TwoWaySplitNum(0) <> "" Then
                Pct1Char = RegExpFind(TwoWaySplitNum(0), "^\d\d")
                Pct1 = Val(Pct1Char)
                Pct2Char = RegExpFind(TwoWaySplitNum(0), "\d\d$")
                Pct2 = Val(Pct2Char)
                
                TwoWaySplitPeople = RegExpFind(SplitTxt, "[a-zA-Z ]+/[a-zA-Z ]+")
                Name1 = RegExpFind(TwoWaySplitPeople, "^[a-zA-Z ]+")
                Name2 = RegExpFind(TwoWaySplitPeople, "[a-zA-Z ]+$")
            End If
    
    
    'trim outside spaces from names
    
    
    'if parse fails, highlight row
    
    
    'copy to result sheet, do math and lookup team where possible (need team vlookup sheet)
        
        
        
        End If
    Next i
    
    End Sub
    Last edited by jrussell; 04-23-2010 at 02:46 PM.

  4. #4
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,259

    Re: Return text with regex find function

    Hello jrussell,

    Sorry for the delay. I had some other unexpected business to attend to today. Here is the macro. It will size the range automatically and put the parsed data into the cells to the right of the raw (unparsed) data. This example starts at "A2" on "Sheet1". The parsed data is placed in columns "B:C". You can change this for your needs.
    'Written: April 23, 2010
    'Author:  Leith Ross
    
    Sub SplitData()
    
      Dim Cell As Range
      Dim RegExp As Object
      Dim Rng As Range
      Dim RngEnd As Range
      Dim S1 As String, S2 As String
      Dim SalesPeople As String
      Dim SplitRatio As String
      Dim T1 As Boolean, T2 As Boolean
      Dim Wks As Worksheet
      
        Set Wks = Worksheets("Sheet1")
        
       'Data to parse starts in "A2"
        Set Rng = Wks.Range("A2")
        Set RngEnd = Wks.Cells(Rows.Count, Rng.Column).End(xlUp)
        Set Rng = IIf(RngEnd.Row < Rng.Row, Rng, Wks.Range(Rng, RngEnd))
        
          Set RegExp = CreateObject("VBScript.RegExp")
          RegExp.Global = False
          RegExp.IgnoreCase = True
          
          SalesPeople = "^[a-zA-Z\/\s]+\s|\s[a-zA-Z\/\s]+\s|\s[a-zA-Z\/\s]+$"
          SplitRatio = "^[\/\d]+\s|\s[\/\d]+\s|\s[\/\d]+$"
          
          For Each Cell In Rng
            RegExp.Pattern = SalesPeople
              T1 = RegExp.Test(Cell)
              If T1 Then S1 = RegExp.Execute(Cell)(0)
            RegExp.Pattern = SplitRatio
              T2 = RegExp.Test(Cell)
              If T2 Then S2 = RegExp.Execute(Cell)(0)
            If T1 And T2 Then
               Cell.Offset(0, 1) = S1   'Sales people in column "B"
               Cell.Offset(0, 2) = S2   'Split info In column "C"
            End If
          Next Cell
       
        Set RegExp = Nothing
        
    End Sub

  5. #5
    Registered User
    Join Date
    11-11-2008
    Location
    Syracuse NY
    MS-Off Ver
    2007
    Posts
    90

    Re: Return text with regex find function

    Leith, thanks for this. It might take me some time, but I will see what I can do with it.

  6. #6
    Registered User
    Join Date
    11-11-2008
    Location
    Syracuse NY
    MS-Off Ver
    2007
    Posts
    90

    Re: Return text with regex find function

    Edit: Got past that step, still working on it though...
    Last edited by jrussell; 04-29-2010 at 10:32 AM.

  7. #7
    Registered User
    Join Date
    11-11-2008
    Location
    Syracuse NY
    MS-Off Ver
    2007
    Posts
    90

    Re: Return text with regex find function

    Hi Leith, just a note to let you know that I got this to do what I needed. I had to add a bunch of other stuff, but it works great now!

    Sub Splits()
    
    'Written: April 23, 2010
    'Author:  Leith Ross & Josh Russell
    'from http://www.excelforum.com/excel-programming/726600-return-text-with-regex-find-function.html
    'NOTE: requires a reference to the Microsoft VBScript Regular Expression type library
    
        Dim SrcSht As Worksheet
        Dim LastRow As Long
        Dim SplitRng As Range
        Dim SplitCol As Long
        Dim SalesPeople As String, SplitRatio As String, MatchThis3 As String, MatchThis4 As String, MatchThis5 As String
        Dim UnitName As String, UnitNum As String
        Dim S1 As String, S2 As String, S3 As String, S4 As String, S5 As String
        Dim T1 As Boolean, T2 As Boolean, T3 As Boolean, T4 As Boolean, T5 As Boolean
        Dim i As Long
        Dim re As New RegExp 'was Object
        Dim ma As Match
        Dim maCol As MatchCollection
        Dim MatchArrayName() As Variant
        Dim MatchArrayNum() As Variant
        Dim S2Array() As Variant
        Dim ma2 As Match
        Dim ma2Col As MatchCollection
        Dim j As Long
        Dim k As Integer
        Dim TempText As String
        Dim val1 As Long, val2 As Long, val3 As Long
        Dim ResultRow As Long
            
        Set SrcSht = ActiveSheet
        LastRow = SrcSht.UsedRange.Rows.Count
        'find "Split Info" column
        Set SplitRng = Rows("1:1").Find(What:="Split Info", After:=Range("A1"), LookIn:=xlValues, LookAt:= _
            xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=False, SearchFormat:=False)
        SplitCol = SplitRng.Column
        ResultRow = LastRow + 3
        
        Set re = CreateObject("VBScript.RegExp")
        re.Global = False
        re.IgnoreCase = True
        
        'as Leith Ross wrote
        'SalesPeople = "^[a-zA-Z'\/\s]+\s|\s[a-zA-Z'\/\s]+\s|\s[a-zA-Z'\/\s]+$"
        'SplitRatio = "^[\/\d]+\s|\s[\/\d]+\s|\s[\/\d]+$"
        
        '(word-slash[mandatory])-word-slash 1 or more times, matching only 2 char words or longer
        SalesPeople = "\s?[a-zA-Z' ]{2,}\s?\/(\/?\s?[a-zA-Z' ]{2,}\s?)+"
        'slash-number-slash 2 or more times
        SplitRatio = "(\/?\s?\d+\s?){2,}"
        'name-number-slash format, 3 or 2 times
        MatchThis3 = "(\/?[a-zA-Z' ]{2,}\s?,?\s?\d+\s?,?\s?){2,3}"
        'name-dash-number-colon format, 3 or 2 times
        MatchThis4 = "([a-zA-Z' ]{2,}\s?,?\s?-\s?,?\s?\d+\s?,?\s?:?){2,3}"
        'number-dash-name format, 3 or 2 times
        MatchThis5 = "(\d+\s?\-\s?[a-zA-Z']{2,},?\s?){2,3}"
        
        'break it down into units, need regex patterns for each name and number
        UnitName = "[a-zA-Z' ]{2,}"
        UnitNum = "\d+"
        
        j = -1
        'for each row where "Split Info" column is not empty
        For i = 2 To LastRow
            If IsEmpty(Cells(i, SplitCol)) Then GoTo ResetLine
                re.Global = False
                re.Pattern = SalesPeople
                    T1 = re.Test(Cells(i, SplitCol))
                    If T1 Then
                        S1 = re.Execute(Cells(i, SplitCol))(0) 'S1 = names div by slashes
                        re.Global = True
                        j = -1
                        re.Pattern = UnitName
                        Set maCol = re.Execute(S1)
                        ReDim MatchArrayName(1, maCol.Count - 1) As Variant
                        For Each ma In maCol
                            j = j + 1
                            MatchArrayName(0, j) = ma.Value
                            MatchArrayName(1, j) = ma.FirstIndex
                        Next
                    End If
                TempText = Cells(i, SplitCol).text
    TryAgainWithoutDate:
                re.Global = False
                re.Pattern = SplitRatio
                    T2 = re.Test(TempText)
                    If T2 Then
                        Set ma2Col = re.Execute(TempText) 'S2 = numbers div by slashes 'removing(0) fixed type mismatch
                        ReDim S2Array(1, 0) As Variant
                        For Each ma2 In ma2Col
                            S2Array(0, 0) = ma2.Value
                            S2Array(1, 0) = ma2.FirstIndex
                        Next
                        
                        re.Global = True
                        j = -1
                        re.Pattern = UnitNum
                        Set maCol = re.Execute(S2Array(0, 0)) 'original
                        ReDim MatchArrayNum(1, maCol.Count - 1) As Variant
                        For Each ma In maCol
                            j = j + 1
                            MatchArrayNum(0, j) = ma.Value
                            MatchArrayNum(1, j) = ma.FirstIndex
                        Next
                        If j = 2 Then
                            val1 = Val(MatchArrayNum(0, 0))
                            val2 = Val(MatchArrayNum(0, 1))
                            val3 = Val(MatchArrayNum(0, 2))
                            If val1 + val2 + val3 <> 100 Then
                                k = S2Array(1, 0) + 1
                                TempText = WorksheetFunction.Replace(TempText, k, 8, "________") 'figure out where S2 is in larger thing, get new S2
                                GoTo TryAgainWithoutDate
                            End If
                        End If
                    End If
                re.Global = False
                re.Pattern = MatchThis3
                    T3 = re.Test(Cells(i, SplitCol))
                    If T3 Then
                        S3 = re.Execute(Cells(i, SplitCol))(0) 'S3 = name-number-slash format, 3 or 2 times
                        re.Global = True
                        j = -1
                        re.Pattern = UnitName
                        Set maCol = re.Execute(S3)
                        ReDim MatchArrayName(1, maCol.Count - 1) As Variant
                        For Each ma In maCol
                            j = j + 1
                            MatchArrayName(0, j) = ma.Value
                            MatchArrayName(1, j) = ma.FirstIndex
                        Next
                        j = -1
                        re.Pattern = UnitNum
                        Set maCol = re.Execute(S3)
                        ReDim MatchArrayNum(1, maCol.Count - 1) As Variant
                        For Each ma In maCol
                            j = j + 1
                            MatchArrayNum(0, j) = ma.Value
                            MatchArrayNum(1, j) = ma.FirstIndex
                        Next
                        GoTo FoundResults
                    End If
                re.Global = False
                re.Pattern = MatchThis4
                    T4 = re.Test(Cells(i, SplitCol))
                    If T4 Then
                        S4 = re.Execute(Cells(i, SplitCol))(0) 'S4 = name-dash-number-colon format, 3 or 2 times
                        re.Global = True
                        j = -1
                        re.Pattern = UnitName
                        Set maCol = re.Execute(S4)
                        ReDim MatchArrayName(1, maCol.Count - 1) As Variant
                        For Each ma In maCol
                            j = j + 1
                            MatchArrayName(0, j) = ma.Value
                            MatchArrayName(1, j) = ma.FirstIndex
                        Next
                        j = -1
                        re.Pattern = UnitNum
                        Set maCol = re.Execute(S4)
                        ReDim MatchArrayNum(1, maCol.Count - 1) As Variant
                        For Each ma In maCol
                            j = j + 1
                            MatchArrayNum(0, j) = ma.Value
                            MatchArrayNum(1, j) = ma.FirstIndex
                        Next
                        GoTo FoundResults
                    End If
                re.Global = False
                re.Pattern = MatchThis5
                    T5 = re.Test(Cells(i, SplitCol))
                    If T5 Then
                        S5 = re.Execute(Cells(i, SplitCol))(0) 'S5 = number-dash-name format, 3 or 2 times
                        re.Global = True
                        j = -1
                        re.Pattern = UnitName
                        Set maCol = re.Execute(S5)
                        ReDim MatchArrayName(1, maCol.Count - 1) As Variant
                        For Each ma In maCol
                            j = j + 1
                            MatchArrayName(0, j) = ma.Value
                            MatchArrayName(1, j) = ma.FirstIndex
                        Next
                        j = -1
                        re.Pattern = UnitNum
                        Set maCol = re.Execute(S5)
                        ReDim MatchArrayNum(1, maCol.Count - 1) As Variant
                        For Each ma In maCol
                            j = j + 1
                            MatchArrayNum(0, j) = ma.Value
                            MatchArrayNum(1, j) = ma.FirstIndex
                        Next
                        GoTo FoundResults
                    End If
    Text too long... continued in next post.

  8. #8
    Registered User
    Join Date
    11-11-2008
    Location
    Syracuse NY
    MS-Off Ver
    2007
    Posts
    90

    Re: Return text with regex find function

            
    FoundResults:
            '---------------------------------------------------------------
            'instead of this, need to:
            'copy whole rows 2 or 3 times to bottom, put name in col F, look up team col E,
            'multiply SALE/COST/GP by percentage in cols K,L,M
    '        Cells(i, SplitCol).Offset(0, 1) = S1
    '        On Error Resume Next
    '        Cells(i, SplitCol).Offset(0, 2) = S2Array(0, 0)
    '        On Error GoTo 0
    '        Cells(i, SplitCol).Offset(0, 3) = S3
    '        Cells(i, SplitCol).Offset(0, 4) = S4
    '        Cells(i, SplitCol).Offset(0, 5) = S5
    '        On Error Resume Next
    '        'final answers in MatchArrayName and MatchArrayNum
    '        Cells(i, SplitCol).Offset(0, 6) = WorksheetFunction.Trim(MatchArrayName(0, 0))
    '        Cells(i, SplitCol).Offset(0, 7) = WorksheetFunction.Trim(MatchArrayName(0, 1))
    '        Cells(i, SplitCol).Offset(0, 8) = WorksheetFunction.Trim(MatchArrayName(0, 2))
    '        Cells(i, SplitCol).Offset(0, 9) = WorksheetFunction.Trim(MatchArrayNum(0, 0))
    '        Cells(i, SplitCol).Offset(0, 10) = WorksheetFunction.Trim(MatchArrayNum(0, 1))
    '        Cells(i, SplitCol).Offset(0, 11) = WorksheetFunction.Trim(MatchArrayNum(0, 2))
    '        On Error GoTo 0
            '---------------------------------------------------------------
            
            '---------------------------------------------------------------
            If j > 0 Then
                '2 way split
                Rows(i).Copy Destination:=ActiveSheet.Cells(ResultRow, 1)
                Cells(ResultRow, 6).Value = WorksheetFunction.Trim(MatchArrayName(0, 0))
                Cells(ResultRow, 5).Formula = "=VLOOKUP(""" & Cells(ResultRow, 6) & """,'rep names lookup'!A:C,3,0)"
                Cells(ResultRow, 11).Formula = "=" & Cells(i, 11) & "*" & (Val(WorksheetFunction.Trim(MatchArrayNum(0, 0))) / 100)
                Cells(ResultRow, 12).Formula = "=" & Cells(i, 12) & "*" & (Val(WorksheetFunction.Trim(MatchArrayNum(0, 0))) / 100)
                ResultRow = ResultRow + 1
                Rows(i).Copy Destination:=ActiveSheet.Cells(ResultRow, 1)
                Cells(ResultRow, 6).Value = WorksheetFunction.Trim(MatchArrayName(0, 1))
                Cells(ResultRow, 5).Formula = "=VLOOKUP(""" & Cells(ResultRow, 6) & """,'rep names lookup'!A:C,3,0)"
                Cells(ResultRow, 11).Formula = "=" & Cells(i, 11) & "*" & (Val(WorksheetFunction.Trim(MatchArrayNum(0, 1))) / 100)
                Cells(ResultRow, 12).Formula = "=" & Cells(i, 12) & "*" & (Val(WorksheetFunction.Trim(MatchArrayNum(0, 1))) / 100)
                ResultRow = ResultRow + 1
            End If
            If j = 2 Then
                '3 way split
                Rows(i).Copy Destination:=ActiveSheet.Cells(ResultRow, 1)
                Cells(ResultRow, 6).Value = WorksheetFunction.Trim(MatchArrayName(0, 2))
                Cells(ResultRow, 5).Formula = "=VLOOKUP(""" & Cells(ResultRow, 6) & """,'rep names lookup'!A:C,3,0)"
                Cells(ResultRow, 11).Formula = "=" & Cells(i, 11) & "*" & (Val(WorksheetFunction.Trim(MatchArrayNum(0, 2))) / 100)
                Cells(ResultRow, 12).Formula = "=" & Cells(i, 12) & "*" & (Val(WorksheetFunction.Trim(MatchArrayNum(0, 2))) / 100)
                ResultRow = ResultRow + 1
            End If
            ResultRow = ResultRow + 1
            '---------------------------------------------------------------
            
    ResetLine:
            'reset everything
            T1 = False
            T2 = False
            T3 = False
            T4 = False
            T5 = False
            S1 = ""
            S2 = ""
            S3 = ""
            S4 = ""
            S5 = ""
        Next
       
        Set re = Nothing
        
        Columns("E:E").Copy
        Columns("E:E").PasteSpecial Paste:=xlPasteValues, Operation:=xlNone, SkipBlanks:=False, Transpose:=False
        Application.CutCopyMode = False
        
    End Sub

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1