+ Reply to Thread
Results 1 to 11 of 11

Formula for detecting inexact duplicates?

Hybrid View

  1. #1
    Registered User
    Join Date
    06-15-2012
    Location
    Montreal, QC
    MS-Off Ver
    Excel 2003
    Posts
    5

    Angry Formula for detecting inexact duplicates?

    Hi there,

    I have a relatively large Excel 2003 file (2 worksheets of ~5000 companies) that needs to be uploaded to a Microsoft Dynamics database. I want to upload it without any duplicates, and that includes any companies that are very similar to pre-existing companies in the MD database. For example, I might have a company called "Microsoft, Inc" in my upload file, and the existing file in the database is called "Microsoft Inc." which is very similar but is off by 2 characters, which would create a duplicate. Is there a formula that I can apply to my data that will flag entries that are off by say, 2-5 characters? Any ideas?

    Thanks!

  2. #2
    Forum Guru TMS's Avatar
    Join Date
    07-15-2010
    Location
    The Great City of Manchester, NW England ;-)
    MS-Off Ver
    MSO 2007,2010,365
    Posts
    48,077

    Re: Formula for detecting inexact duplicates?

    I've never used it but it might be what you want ... although this refers to Excel 2010.

    Or you can Google it ... http://bit.ly/MvWIh2

    Regards, TMS
    Trevor Shuttleworth - Retired Excel/VBA Consultant

    I dream of a better world where chickens can cross the road without having their motives questioned

    'Being unapologetic means never having to say you're sorry' John Cooper Clarke


  3. #3
    Registered User
    Join Date
    06-15-2012
    Location
    Montreal, QC
    MS-Off Ver
    Excel 2003
    Posts
    5

    Re: Formula for detecting inexact duplicates?

    Thanks, that got me on the right track at least.

  4. #4
    Forum Guru TMS's Avatar
    Join Date
    07-15-2010
    Location
    The Great City of Manchester, NW England ;-)
    MS-Off Ver
    MSO 2007,2010,365
    Posts
    48,077

    Re: Formula for detecting inexact duplicates?

    You're welcome.

  5. #5
    Administrator FDibbins's Avatar
    Join Date
    12-29-2011
    Location
    Duncansville, PA USA
    MS-Off Ver
    Excel 7/10/13/16/365 (PC ver 2310)
    Posts
    53,047

    Re: Formula for detecting inexact duplicates?

    could you do a match on the 1st 4 characters, then 1st 5 etc, to narrow your list down? you could put the number of characters in a "reference" cell, and then just change that. build the formula to return just the matches, then add a filter to only show the matches?
    1. Use code tags for VBA. [code] Your Code [/code] (or use the # button)
    2. If your question is resolved, mark it SOLVED using the thread tools
    3. Click on the star if you think someone helped you

    Regards
    Ford

  6. #6
    Registered User
    Join Date
    06-15-2012
    Location
    Montreal, QC
    MS-Off Ver
    Excel 2003
    Posts
    5

    Re: Formula for detecting inexact duplicates?

    That sounds like a good idea, but I'm not sure how to do a match on a certain number of characters. Could you give an example please?

  7. #7
    Forum Expert Cutter's Avatar
    Join Date
    05-24-2004
    Location
    Ontario,Canada
    MS-Off Ver
    Excel 2010
    Posts
    6,451

    Re: Formula for detecting inexact duplicates?

    Like this:

    =LEFT(A1,5)=LEFT(B1,5) will return TRUE if the first 5 characters match and FALSE if they don't

  8. #8
    Registered User
    Join Date
    06-15-2012
    Location
    Montreal, QC
    MS-Off Ver
    Excel 2003
    Posts
    5

    Re: Formula for detecting inexact duplicates?

    Thanks for your reply, but wouldn't that mean that the company names in columns A and B would have to line up? I don't necessarily have the same number of companies in both lists, so they wouldn't line up one to one.

  9. #9
    Forum Expert Cutter's Avatar
    Join Date
    05-24-2004
    Location
    Ontario,Canada
    MS-Off Ver
    Excel 2010
    Posts
    6,451

    Re: Formula for detecting inexact duplicates?

    OK, then like this:

    =MATCH(LEFT(B2,C1)&"*",A:A,0)

    this will produce a numeric result for a match of the first x number of characters of the value of B2 match the same first x number of characters of a value in column A - the value of x is in C1 - and it will produce #N/A where no such match is found

  10. #10
    Registered User
    Join Date
    06-15-2012
    Location
    Montreal, QC
    MS-Off Ver
    Excel 2003
    Posts
    5

    Re: Formula for detecting inexact duplicates?

    Thanks again. I'm having trouble running the formula though, even though it seems like I'm right on the verge of getting it. Let's say my array is column A, and value I'm trying to find is B2, B3 and so on. Where does the C1 in your formula come into play? And the &"*" ?

  11. #11
    Administrator FDibbins's Avatar
    Join Date
    12-29-2011
    Location
    Duncansville, PA USA
    MS-Off Ver
    Excel 7/10/13/16/365 (PC ver 2310)
    Posts
    53,047

    Re: Formula for detecting inexact duplicates?

    the B2 is the text you want to search in, and the C1 is the number of characters you want to return (4 or 5 or 6 etc)

    the "*" is a wildcard in the match

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1