+ Reply to Thread
Results 1 to 6 of 6

How to filter / sort data column using only partial match - regexp

  1. #1
    Registered User
    Join Date
    04-16-2013
    Location
    Ireland
    MS-Off Ver
    Excel 2010
    Posts
    4

    How to filter / sort data column using only partial match - regexp

    Hi,

    I am using Excel 2010. I am a novice user.

    I have a lot of data to filter / sort.
    I want to initially to create a filter for a column of data - which has the format similar to hierarchical paths to files.
    The data is a mix of text/numbers. e.g.

    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_254__5_0/d
    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_253__5_0/d
    pathA/path_123/path_456/data_out_reg_17_0/d
    pathA/path_123/path_456/data_out_reg_0_0/d
    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_255__5_0/d
    pathA/path_X/path_Y/path_Z/path_D1/lfc_out_rxs_reg_4_0/d
    pathA/path_123/path_456/data_out_reg_4_0/d
    pathA/path_X/path__AA/path__AB/secure_push_data_rxs_reg_52_0/d
    pathA/path_X/path_Y/path_Z/path_D1/lfc_out_rxs_reg_1_0/d
    pathA/path_X/path__AA/path__AB/push_byte_offset_rxs_reg_2_0/d
    pathA/path_X/path_Y/path_Z/path_D1/lfc_out_rxs_reg_2_0/d
    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_237__6_0/d
    pathA/path_X/path_Y/path_Z/path_D1/calc_lcode_rxs_reg_13_0/d
    pathA/path_X/path__AA/path__AB/ssm_ath1_push_data_rxs_reg_52_0/d
    pathA/path_X/path_Y/path_Z/path_D1/calc_lcode_rxs_reg_6_0/d
    pathA/path_X/path__AA/path__AB/ssm_pke3_push_data_rxs_reg_26_0/d
    pathA/path_X/path__AA/path__AB/ssm_pke4_push_data_rxs_reg_26_0/d
    pathA/path_X/path_Y/path_Z/lsymb0_1_0_rxs_reg_3_0/d
    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_230__6_0/d
    pathA/path_X/path_Y/path_Z/path_D1/calc_lcode_rxs_reg_12_0/d

    Doing an alphabetical sort of this date would return the following order.
    As you can see while each strings in unique - there are many instances where they are simialr - if you ignore the unique numberic values at the end of the string.

    pathA/path_123/path_456/data_out_reg_0_0/d
    pathA/path_123/path_456/data_out_reg_17_0/d
    pathA/path_123/path_456/data_out_reg_4_0/d
    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_230__6_0/d
    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_237__6_0/d
    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_253__5_0/d
    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_254__5_0/d
    pathA/path_X/path_Y/path_Z/lso0_rxs_reg_255__5_0/d
    pathA/path_X/path_Y/path_Z/lsymb0_1_0_rxs_reg_3_0/d
    pathA/path_X/path_Y/path_Z/path_D1/calc_lcode_rxs_reg_12_0/d
    pathA/path_X/path_Y/path_Z/path_D1/calc_lcode_rxs_reg_13_0/d
    pathA/path_X/path_Y/path_Z/path_D1/calc_lcode_rxs_reg_6_0/d
    pathA/path_X/path_Y/path_Z/path_D1/lfc_out_rxs_reg_1_0/d
    pathA/path_X/path_Y/path_Z/path_D1/lfc_out_rxs_reg_2_0/d
    pathA/path_X/path_Y/path_Z/path_D1/lfc_out_rxs_reg_4_0/d
    pathA/path_X/path__AA/path__AB/push_byte_offset_rxs_reg_2_0/d
    pathA/path_X/path__AA/path__AB/secure_push_data_rxs_reg_52_0/d
    pathA/path_X/path__AA/path__AB/ssm_ath1_push_data_rxs_reg_52_0/d
    pathA/path_X/path__AA/path__AB/ssm_pke3_push_data_rxs_reg_26_0/d
    pathA/path_X/path__AA/path__AB/ssm_pke4_push_data_rxs_reg_26_0/d


    So what I want to do is to create a filter for the strings - but ignoring the numeric bits at the end i.e.
    reg_[0-9]+_+[0-9]+/d

    The strings are obviiously of varying length and the number of hierarchical paths is different,
    so I can't split string on "/".
    Similarly folder paths names can contain "_" so can't split string on this either.
    As I don't know how many "/" or "-" instances there will be in the string I don't believe I can use the find function.
    Also as the amount of number will be different i don't think I can use =right(a1,X) either.

    I may be able to search for the pattern above - as this is probabay unique - so maybe it's something like the following pseudo code:

    Function GetString(txt As String) As String
    With CreateObject("VBScript.RegExp")
    .Pattern = "reg_\d+(_)+\d+//d"
    GetString = .execute(txt)(0)
    End With
    End Function

    If I do require VBA code - how do I then use this for creating a column filter?
    Or will I have to extract the filtered data first from the column (and its associated row data) into another worksheet to use?

    Once I have the filter in place I want to create tables using the filtered data -
    so for example each column value above has a lot of associated data values in each row e.g

    26 pathA/path_123/path_456/data_out_reg_0_0/d
    32 pathA/path_123/path_456/data_out_reg_17_0/d
    8 pathA/path_123/path_456/data_out_reg_4_0/d

    So my table would show the name "data_out_reg" and the range of values 8-32.

    Anyway hopefully someone can help with the first part

    Thanks

    Neil

  2. #2
    Forum Contributor Portuga's Avatar
    Join Date
    02-20-2004
    Location
    Portugal
    MS-Off Ver
    365
    Posts
    852

    Re: How to filter / sort data column using only partial match - regexp

    You could use the auto filter "Contains" option to get what you need.

    You can also use VBA to get it (same thing really) and then have buttons with the macros assigned to them.

    Once you click the button, then the filter would change the data accordingly.

    Considering the data would be in column A with A1 as a title here are 2 pieces of code you could adapt.

    Please Login or Register  to view this content.
    Please Login or Register  to view this content.
    If you found the solution to your question. Mark the thread as "Solved"
    Thank everyone that helped you with a valid solution by clicking on their

    There is no such thing as a problem, only a temporary lack of a solution

  3. #3
    Registered User
    Join Date
    04-16-2013
    Location
    Ireland
    MS-Off Ver
    Excel 2010
    Posts
    4

    Re: How to filter / sort data column using only partial match - regexp

    Hi,

    I can see how this would work for a small subset of data - where I would create a new sub-filter to make a partial match for each unique hierarchical path.
    But I have literally thousands of rows of data.

    I was hoping there was a more automated method that could leverage the fact that the data to be "ignored",
    at the end of each cell data, has a defined pattern - suitable for regexp?

    Neil

  4. #4
    Forum Contributor Portuga's Avatar
    Join Date
    02-20-2004
    Location
    Portugal
    MS-Off Ver
    365
    Posts
    852

    Re: How to filter / sort data column using only partial match - regexp

    You are looking for a defined pattern on data that shows, from your description, no global pattern.

    It seems that there is a pattern in some subsets of data and I´m guessing that would be the only way to get progress.

    It also seems that is what you are looking for the table info, and are defining a very specific subset target when you mentioned: "data_out_reg".

    You could have both steps into 1 (Filter + tables) by creating pivot tables from your data and then use the filter approach mentioned above, since pivot tables also have filters with "Contains" capability

  5. #5
    Forum Contributor Portuga's Avatar
    Join Date
    02-20-2004
    Location
    Portugal
    MS-Off Ver
    365
    Posts
    852

    Re: How to filter / sort data column using only partial match - regexp

    Unless "reg" is a global delimiter and you want anything left of it, in which case you could use the following to get that string:

    =LEFT(A1;FIND("reg";A1)-1)

  6. #6
    Registered User
    Join Date
    04-16-2013
    Location
    Ireland
    MS-Off Ver
    Excel 2010
    Posts
    4

    Re: How to filter / sort data column using only partial match - regexp

    Hi,

    yes that's working for a large number of the paths. Thanks you very much!

    =LEFT(A1;FIND("reg";A1)-1)

    However I need to expand the searches for other similar options

    e.g. ignore characters after: _reg_ OR [ OR shiftflop_inst_nocg_


    =IFERROR(LEFT(C2,FIND("_reg_",C2)-1),IFERROR(LEFT(C2,FIND("[",C2)-1),IFERROR(LEFT(C2,FIND("/shiftflop_inst_nocg_",C2)-1),C2)))



    This works for the first 2 cases - but NOT the last.
    If I remove the first 2 and ONLY include the last it works fine!

    =IFERROR(LEFT(C2,FIND("/shiftflop_inst_nocg_",C2)-1),C2)))


    Am I limited on the number of nested LEFT/FIND statements I can use?
    Is there a better option?
    I was trying to use the OR function - but didn't have any success.

    Thanks

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1