
Complex Duplicate Removal

bluescreenofdeath Complex Duplicate Removal 12-13-2012, 12:51 AM
  1. #1
    Registered User
    Join Date: 12-12-2012
    Location: Atlanta, Georgia
    MS-Off Ver: Excel 2010
    Posts: 3

    Complex Duplicate Removal

    Hi guys, I’m new to the forum and have a rather complex problem that I really need help with. I’m not learning VBA nearly fast enough to handle this myself! ☹

    I have a spreadsheet with a list of clients collected from various state sites, but the formatting of the information is inconsistent. I’ve manually parsed the combined lists, but as we keep them updated in the future, we need a simpler way to check for duplicates.

    In Sheet1, we have the manually combined list with the following column headers: Client Name, DBA Name 1, DBA Name 2, DBA Name 3, Address_Line1, Address_Line2, Address_City, Address_State, Address_Zip, phone number, Website URL, email 1, email 2, email 3, email 4, email 5. All the records are complete in Sheet1.

    In Sheet2, we’ll paste the partially complete state records in the same field format as Sheet1. The main problem with simply using “Remove Duplicates” is that the states often don’t require the same formatting for LLC, LP, LTD, etc. In Sheet1, client names (and the DBA fields) read like “ABC, LLC”, but many state agencies list the names as “ABC LLC” or just “ABC”, so when we try to remove duplicates, Excel doesn’t recognize that they’re the same. In addition, client “ABC, LLC” might have a DBA Name 1 of “ABC1, LLC”, so a newly added entry isn’t removed if it has “ABC1, LLC” in the Client Name field.
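
    To make the name problem concrete, here is a rough sketch of the kind of normalization I think is needed. It is in Python only to illustrate the logic; a VBA version would follow the same steps. The function name `normalize_name` and the suffix list are my own placeholders (I only listed the three suffixes mentioned above, and the real list would need to be extended).

```python
import re

# Assumption: only the suffixes named in the post; extend for the real data.
ENTITY_SUFFIXES = {"LLC", "LP", "LTD"}

def normalize_name(name: str) -> str:
    """Uppercase a name, drop punctuation, and strip trailing entity suffixes."""
    # Remove periods outright so "L.L.C." collapses to "LLC"
    cleaned = name.upper().replace(".", "")
    # Turn any remaining punctuation into spaces
    cleaned = re.sub(r"[^A-Z0-9 ]+", " ", cleaned)
    tokens = cleaned.split()
    # Strip suffixes from the end, but always keep at least one token
    while len(tokens) > 1 and tokens[-1] in ENTITY_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

# All three spellings from the example reduce to the same key
print(normalize_name("ABC, LLC"), normalize_name("ABC LLC"), normalize_name("ABC"))
```

    With this, “ABC, LLC”, “ABC LLC”, and “ABC” all compare equal, which is the behavior “Remove Duplicates” is missing.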

    Ideally, this is what we need: we have a complete database in Sheet1, and we need to remove the duplicates in Sheet2 using Client Name, DBA Name 1, DBA Name 2, DBA Name 3, Address_City, and Address_State. We need to check the client name in Sheet2 against the DBA names we have in Sheet1 and treat those as duplicates as well. The non-duplicates in Sheet2 should be copied to Sheet3 (or the duplicates could simply be deleted from Sheet2, leaving the non-duplicates there) with the correct “ABC, LLC” formatting. I’m assuming the Client Name comparison almost has to be a fuzzy match in order to work, and we’re at our wits’ end trying to get this figured out. Manually parsing the lists took us over 2 weeks.
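
    Roughly, the check I have in mind could be sketched like this. Again, Python is used only to show the logic, and `row_keys` and `find_new_rows` are hypothetical helper names. Each row is assumed to be a dictionary keyed by the column headers, and the suffix list is limited to the three suffixes named above. This is an exact match on normalized names plus city and state, not a true fuzzy match, so it would not catch misspellings or records with a blank city.

```python
import re

ENTITY_SUFFIXES = {"LLC", "LP", "LTD"}  # assumption: extend for the real data
NAME_FIELDS = ["Client Name", "DBA Name 1", "DBA Name 2", "DBA Name 3"]

def normalize_name(name):
    """Uppercase, drop punctuation, and strip trailing entity suffixes."""
    cleaned = re.sub(r"[^A-Z0-9 ]+", " ", name.upper().replace(".", ""))
    tokens = cleaned.split()
    while len(tokens) > 1 and tokens[-1] in ENTITY_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

def row_keys(row):
    """Return one (name, city, state) key per non-empty name field in a row."""
    city = (row.get("Address_City") or "").strip().upper()
    state = (row.get("Address_State") or "").strip().upper()
    keys = []
    for field in NAME_FIELDS:
        name = normalize_name(row.get(field) or "")
        if name:
            keys.append((name, city, state))
    return keys

def find_new_rows(master_rows, incoming_rows):
    """Return incoming rows whose names match no client or DBA name in master."""
    known = {key for row in master_rows for key in row_keys(row)}
    return [row for row in incoming_rows
            if not any(key in known for key in row_keys(row))]

# Made-up example data, in the same spirit as the attached file
sheet1 = [{"Client Name": "ABC, LLC", "DBA Name 1": "ABC1, LLC",
           "Address_City": "Atlanta", "Address_State": "GA"}]
sheet2 = [
    {"Client Name": "ABC LLC", "Address_City": "Atlanta", "Address_State": "GA"},
    {"Client Name": "ABC1", "Address_City": "atlanta", "Address_State": "ga"},
    {"Client Name": "XYZ LP", "Address_City": "Atlanta", "Address_State": "GA"},
]
sheet3 = find_new_rows(sheet1, sheet2)
print([row["Client Name"] for row in sheet3])
```

    In this made-up data, only the “XYZ LP” row would be carried over to Sheet3, because “ABC LLC” matches the Sheet1 client name and “ABC1” matches its DBA Name 1.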

    I attached an example file of what we’re dealing with (no info is real).

    THANK YOU in advance!
