+ Reply to Thread
Results 1 to 9 of 9

Delete Duplicates based on multiple criteria

Hybrid View

  1. #1
    Registered User
    Join Date
    04-07-2022
    Location
    Kentucky
    MS-Off Ver
    Excel 365
    Posts
    32

    Delete Duplicates based on multiple criteria

    Attached is a sample workbook that needs to be cleaned up. I need to remove duplicates based on multiple criteria. The criteria are Project Number, Invoice Business Unit and Total. Where it confuses me is that I need to compare these records against each other and delete duplicates on the following conditions.

    Does the Project Number Match?
    Does the Invoice Business Unit Match?
    If the record meets both conditions and the total is zero and is the only one, it stays
    If the record meets both conditions and the total is zero and is not the only one, all duplicate records go
    If the Project Number matches but the Invoice Business Unit doesn't match, then it stays
    If nothing matches, it stays

    I hope the example explains this better than what I've typed above. Everything staying is in green, and everything that would be deleted is in yellow. Which zero dollar record that gets deleted is irrelevant as long as one remains from the matching conditions.

    Any help or insight would be greatly appreciated.
    Attached Files Attached Files

  2. #2
    Banned User!
    Join Date
    02-05-2015
    Location
    San Escobar
    MS-Off Ver
    any on PC except 365
    Posts
    12,168

    Cool Re: Delete Duplicates based on multiple criteria

    is that what you want?

    Total Invoice Business Unit Project Number Total Invoice Business Unit Project Number Index
    $0.00
    Department A Customer 1234
    1
    0
    Department A Customer 1234
    1
    $0.00
    Department A Customer 1234
    2
    0
    Department B Customer 1234
    4
    $0.00
    Department A Customer 1234
    3
    0
    Department B Customer 2345
    6
    $0.00
    Department B Customer 1234
    4
    89
    Department B Customer 2345
    7
    $0.00
    Department A Customer 1234
    5
    89
    Department C Customer 0812
    8
    $0.00
    Department B Customer 2345
    6
    0
    Department D Customer 0812
    9
    $89.00
    Department B Customer 2345
    7
    0
    Department C Customer 0812
    10
    $89.00
    Department C Customer 0812
    8
    $0.00
    Department D Customer 0812
    9
    $0.00
    Department C Customer 0812
    10
    $0.00
    Department D Customer 0812
    11

    Ijndex is for visualization only

    // Table1
    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        Type = Table.TransformColumnTypes(Source,{{"Total", Int64.Type}, {"Invoice Business Unit", type text}, {"Project Number", type text}}),
        Idx = Table.AddIndexColumn(Type, "Index", 1, 1),
        Distinct = Table.Distinct(Idx, Table.ColumnNames(Type))
    in
        Distinct
    Last edited by sandy666; 05-19-2023 at 12:34 PM.

  3. #3
    Forum Guru HansDouwe's Avatar
    Join Date
    06-21-2022
    Location
    Nederland
    MS-Off Ver
    365 V2403 (Build 17330.20000)
    Posts
    6,466

    Re: Delete Duplicates based on multiple criteria

    Is =UNIQUE(Table1) what you want?

    See attachment.
    Attached Files Attached Files

  4. #4
    Registered User
    Join Date
    04-07-2022
    Location
    Kentucky
    MS-Off Ver
    Excel 365
    Posts
    32

    Re: Delete Duplicates based on multiple criteria

    Hans,

    This provides data similar to Sandy's solution. Close but not exact. Thank you for replying.

  5. #5
    Registered User
    Join Date
    04-07-2022
    Location
    Kentucky
    MS-Off Ver
    Excel 365
    Posts
    32

    Re: Delete Duplicates based on multiple criteria

    Sandy,

    That is close, but for Customer 2345, I only need the one record that has money on it. The second Customer record that matches Department B and has no money would need to go. All records with money on it would also need to be kept.

  6. #6
    Banned User!
    Join Date
    02-05-2015
    Location
    San Escobar
    MS-Off Ver
    any on PC except 365
    Posts
    12,168

    Re: Delete Duplicates based on multiple criteria

    Quote Originally Posted by ExcelDavid4 View Post
    Sandy,
    That is close, but for Customer 2345, I only need the one record that has money on it. The second Customer record that matches Department B and has no money would need to go. All records with money on it would also need to be kept.
    like this?
    Total Invoice Business Unit Project Number Total Invoice Business Unit Project Number Index
    $0.00
    Department A Customer 1234
    1
    0
    Department A Customer 1234
    1
    $0.00
    Department A Customer 1234
    2
    0
    Department B Customer 1234
    4
    $0.00
    Department A Customer 1234
    3
    89
    Department B Customer 2345
    7
    $0.00
    Department B Customer 1234
    4
    89
    Department C Customer 0812
    8
    $0.00
    Department A Customer 1234
    5
    0
    Department D Customer 0812
    9
    $0.00
    Department B Customer 2345
    6
    $89.00
    Department B Customer 2345
    7
    $89.00
    Department C Customer 0812
    8
    $0.00
    Department D Customer 0812
    9
    $0.00
    Department C Customer 0812
    10
    $0.00
    Department D Customer 0812
    11

    or post example of proper result
    Last edited by sandy666; 05-19-2023 at 01:39 PM.

  7. #7
    Registered User
    Join Date
    04-07-2022
    Location
    Kentucky
    MS-Off Ver
    Excel 365
    Posts
    32

    Re: Delete Duplicates based on multiple criteria

    Sandy,

    Yes that's it! How did you accomplish that?

  8. #8
    Banned User!
    Join Date
    02-05-2015
    Location
    San Escobar
    MS-Off Ver
    any on PC except 365
    Posts
    12,168

    Cool Re: Delete Duplicates based on multiple criteria

    Just Power Query

    // Table1
    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        Type = Table.TransformColumnTypes(Source,{{"Total", Int64.Type}, {"Invoice Business Unit", type text}, {"Project Number", type text}}),
        Idx = Table.AddIndexColumn(Type, "Index", 1, 1),
        Distinct = Table.Distinct(Idx, Table.ColumnNames(Type)),
        Asc = Table.Sort(Distinct,{{"Total", Order.Ascending}}),
        DDistinct = Table.Distinct(Asc, {"Total", "Invoice Business Unit"}),
        DDDistinct = Table.Distinct(DDistinct, {"Invoice Business Unit", "Project Number"}),
        TSRAsc = Table.Sort(DDDistinct,{{"Index", Order.Ascending}}),
        RC = Table.RemoveColumns(TSRAsc,{"Index"})
    in
        RC
    explanation:
    1. let:
      This keyword indicates the start of the Power Query expression.
    2. Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content]:
      This line assigns the variable Source to the content of a table named "Table1" in the current workbook. It assumes that there is a table named "Table1" in the Excel workbook where the code is running.
    3. Type = Table.TransformColumnTypes(Source,{{"Total", Int64.Type}, {"Invoice Business Unit", type text}, {"Project Number", type text}}):
      This line creates a new variable Type by transforming the column types of the Source table. It specifies that the "Total" column should be of type Int64, and the "Invoice Business Unit" and "Project Number" columns should be of type text.
    4. Idx = Table.AddIndexColumn(Type, "Index", 1, 1):
      This line adds a new column named "Index" to the Type table. The "Index" column contains incremental values starting from 1.
    5. Distinct = Table.Distinct(Idx, Table.ColumnNames(Type)):
      This line creates a new variable Distinct by removing duplicate rows from the Idx table based on all columns of the Type table.
    6. Asc = Table.Sort(Distinct,{{"Total", Order.Ascending}}):
      This line sorts the Distinct table in ascending order based on the "Total" column.
    7. DDistinct = Table.Distinct(Asc, {"Total", "Invoice Business Unit"}):
      This line creates a new variable DDistinct by removing duplicate rows from the Asc table based on the "Total" and "Invoice Business Unit" columns.
    8. DDDistinct = Table.Distinct(DDistinct, {"Invoice Business Unit", "Project Number"}):
      This line further removes duplicate rows from the DDistinct table based on the "Invoice Business Unit" and "Project Number" columns.
    9. TSRAsc = Table.Sort(DDDistinct,{{"Index", Order.Ascending}}):
      This line sorts the DDDistinct table in ascending order based on the "Index" column.
    10. RC = Table.RemoveColumns(TSRAsc,{"Index"}):
      This line creates a new variable RC by removing the "Index" column from the TSRAsc table.
    11. in RC:
      This line indicates the end of the Power Query expression and returns the RC table as the result.

    Overall, the code performs several transformations on the initial data source (Source), including type conversion, removing duplicates, adding an index column, and sorting the data based on specific columns. The final result is a table (RC) with the desired transformations applied.

    Total Invoice Business Unit Project Number
    0
    Department A Customer 1234
    0
    Department B Customer 1234
    89
    Department B Customer 2345
    89
    Department C Customer 0812
    0
    Department D Customer 0812
    Last edited by sandy666; 05-19-2023 at 02:08 PM.

  9. #9
    Valued Forum Contributor
    Join Date
    06-23-2021
    Location
    Romania
    MS-Off Ver
    2021, 365 v 2208
    Posts
    722

    Re: Delete Duplicates based on multiple criteria

    Hi,
    You can do this using sort and remove duplicates
    Attached Images Attached Images
    Diana Tanase


    If the solutions offered helped you to solve your problem, then mark the thread as SOLVED (thread tools in the top menu) and you can click on * to add reputation to those who helped you, as a way to say thank you !

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Replies: 3
    Last Post: 04-17-2020, 04:48 PM
  2. [SOLVED] Find Duplicates in column and then based on row criteria delete the row of that Duplicate
    By bobberuk in forum Excel Programming / VBA / Macros
    Replies: 5
    Last Post: 01-23-2016, 01:56 PM
  3. [SOLVED] Delete Duplicates based on Criteria
    By yuenk in forum Excel Programming / VBA / Macros
    Replies: 3
    Last Post: 12-10-2014, 11:56 AM
  4. Delete Duplicates based on Criteria
    By yuenk in forum Excel Programming / VBA / Macros
    Replies: 4
    Last Post: 12-10-2014, 11:48 AM
  5. [SOLVED] Macro to Delete Duplicates Based on Criteria
    By vcs1161 in forum Excel Programming / VBA / Macros
    Replies: 11
    Last Post: 10-24-2013, 03:19 PM
  6. [SOLVED] Delete duplicates based on criteria
    By chrism_cfu in forum Excel General
    Replies: 6
    Last Post: 12-07-2012, 08:52 AM
  7. Delete Duplicates based on 2 criteria 1st Acutal duplicate 2ND PREVIOUS DATE
    By xcruc1at3r in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 08-24-2011, 09:09 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1