Statistical functions: count but disregard duplicates

  1. #1
    Registered User
    Join Date
    04-19-2009
    Location
    Toronto
    MS-Off Ver
    Excel 2007
    Posts
    1

    Statistical functions: count but disregard duplicates

    Hi,

    Long-time Excel user but fairly new to the statistical functions. Have a bit of a dilemma that I hope someone can help me with.

    I have a spreadsheet that consists of several hundred (will eventually be about 10,000) companies and the types of products they manufacture. It is sorted by customer number, which is unique for each company. Most companies have more than one line in the spreadsheet (many may have up to 30 lines), because they manufacture the same type of product under more than one brand name AND/OR because they manufacture two or more different types of products.

    Simplified example:

    Customer Number   Brand   Type
    1                 A       VACUUM CLEANERS
    1                 A       TOASTERS
    1                 B       VACUUM CLEANERS
    1                 C       AIR CONDITIONERS
    2                 D       TOASTERS
    3                 E       AIR CONDITIONERS
    4                 F       AIR CONDITIONERS
    4                 F       TOASTERS
    4                 G       TOASTERS

    and so on.

    What I would like to do is figure out how many of these companies, of the total number, manufacture a given product type, like toasters for example. If I do a straight COUNTIF on the type column, it will return 4, which is not correct. Three of the four companies manufacture toasters, but since it's listed twice under company 4 (due to two different brand names), that throws off the count. I need it to screen out this type of duplicate entry and return a value of 3.

    I guess what I need is a formula that will assess all of company #1's entries: if "TOASTERS" is listed it counts as true, if there's no listing it counts as false. Then it moves on to company #2's entries and so on.

    I have looked all through Excel help and all over the Internet with no luck. Please help!

    Thanks.
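The check described in the question (one true/false per company, then a count of the trues) can be sketched outside Excel as a sanity check. This is a minimal Python sketch, not an Excel solution; the names `ROWS` and `companies_per_type` are made up for illustration, and the data is the simplified example from the post.

```python
from collections import defaultdict

# Sample rows from the question: (customer_number, brand, product_type).
ROWS = [
    (1, "A", "VACUUM CLEANERS"),
    (1, "A", "TOASTERS"),
    (1, "B", "VACUUM CLEANERS"),
    (1, "C", "AIR CONDITIONERS"),
    (2, "D", "TOASTERS"),
    (3, "E", "AIR CONDITIONERS"),
    (4, "F", "AIR CONDITIONERS"),
    (4, "F", "TOASTERS"),
    (4, "G", "TOASTERS"),
]


def companies_per_type(rows):
    """Map each product type to the number of distinct customers listing it."""
    customers = defaultdict(set)
    for customer, _brand, product_type in rows:
        # A set keeps each customer once, however many brands repeat the type.
        customers[product_type.upper()].add(customer)
    return {ptype: len(ids) for ptype, ids in customers.items()}


counts = companies_per_type(ROWS)
naive = sum(1 for _c, _b, t in ROWS if t == "TOASTERS")  # what a plain COUNTIF sees
print(naive, counts["TOASTERS"])  # 4 3
```

The plain row count gives 4 for toasters, while the per-company count gives the 3 the question asks for.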

  2. #2
    Banned User!
    Join Date
    10-14-2006
    Posts
    1,211

    Re: Statistical functions: count but disregard duplicates

    Quote: Originally posted by voodoomusic (post #1)
    =SUM(N(FREQUENCY(IF(C2:C10="Toasters",MATCH(A2:A10,A2:A10,0)),MATCH(A2:A10,A2:A10,0))>0))

    Confirm with Ctrl+Shift+Enter, not just Enter (this is an array formula).
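For anyone trying to read the formula above, here is a rough Python model of its steps. It is a sketch under assumptions: `match_first` and `count_unique_customers` are hypothetical names, and FREQUENCY is modelled simply as a tally per distinct bin value, which is the only part of its behaviour the formula relies on.

```python
from collections import Counter

CUSTOMERS = [1, 1, 1, 1, 2, 3, 4, 4, 4]          # column A (A2:A10)
TYPES = [
    "VACUUM CLEANERS", "TOASTERS", "VACUUM CLEANERS", "AIR CONDITIONERS",
    "TOASTERS", "AIR CONDITIONERS", "AIR CONDITIONERS", "TOASTERS", "TOASTERS",
]                                                 # column C (C2:C10)


def match_first(values):
    """Like MATCH(x, values, 0) per x: 1-based position of the first occurrence."""
    first = {}
    for pos, value in enumerate(values, start=1):
        first.setdefault(value, pos)
    return [first[value] for value in values]


def count_unique_customers(customers, types, wanted):
    # MATCH(A2:A10, A2:A10, 0): every row of a customer maps to the same position.
    positions = match_first(customers)
    # IF(C2:C10 = wanted, position): keep positions only for matching rows.
    # Excel's "=" on text ignores case, so compare case-insensitively here too.
    kept = [p for p, t in zip(positions, types) if t.upper() == wanted.upper()]
    # FREQUENCY(...) > 0, then SUM(N(...)): count the bins that received a hit.
    tally = Counter(kept)
    return sum(1 for hits in tally.values() if hits > 0)


print(count_unique_customers(CUSTOMERS, TYPES, "Toasters"))  # 3
```

Company 4's two toaster rows both land in the same bin (position 7), so they count once.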

  3. #3
    Forum Expert JBeaucaire's Avatar
    Join Date
    03-21-2004
    Location
    Bakersfield, CA
    MS-Off Ver
    2010, 2016, Office 365
    Posts
    33,492

    Re: Statistical functions: count but disregard duplicates

    Maybe like this. I always use helper columns to sort through duplicates in real time, and with 10,000 rows of data coming, you should avoid array formulas. This will scale up nicely.
    Attached Files

  4. #4
    Forum Expert daddylonglegs's Avatar
    Join Date
    01-14-2006
    Location
    England
    MS-Off Ver
    Microsoft 365
    Posts
    14,677

    Re: Statistical functions: count but disregard duplicates

    I tend to agree with JB. Teethless Mama's suggested formula works great for small ranges, but the MATCH function, particularly across 10,000 cells, is going to be slow.

    Here's another alternative using Database function DCOUNTA.

    I put "Type" in E1 (this has to match the header in C1) and E2 holds the Product you want to count, e.g. TOASTERS.

    F1 has a header that isn't the same as any of the headers in A1:C1 (I chose "Unique"), and under that, in F2, goes this formula:

    =SUMPRODUCT(--(A$2:A2=A2),--(C$2:C2=C2))=1

    [Note: this cell will display TRUE or FALSE; it doesn't matter which]

    Now for the unique count in E4 use this formula

    =DCOUNTA(A1:C10000,1,E1:F2)

    Just change E2 to whatever you want to count. See attached.
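To see why the F2 criterion makes DCOUNTA skip duplicates, the row-by-row logic can be modelled in Python. This is only an illustrative sketch with hypothetical names (`first_occurrence_flags`, `dcounta_like`). It assumes the criterion is evaluated once per data row with the relative references shifting down, so the range grows from row 2 to the current row.

```python
CUSTOMERS = [1, 1, 1, 1, 2, 3, 4, 4, 4]          # column A
TYPES = [
    "VACUUM CLEANERS", "TOASTERS", "VACUUM CLEANERS", "AIR CONDITIONERS",
    "TOASTERS", "AIR CONDITIONERS", "AIR CONDITIONERS", "TOASTERS", "TOASTERS",
]                                                 # column C


def first_occurrence_flags(customers, types):
    """Model of the F2 criterion: TRUE when this (customer, type) pair occurs
    exactly once in the rows from the top down to the current row."""
    flags = []
    for i in range(len(customers)):
        seen = sum(
            1
            for c, t in zip(customers[: i + 1], types[: i + 1])
            if c == customers[i] and t == types[i]
        )
        flags.append(seen == 1)
    return flags


def dcounta_like(customers, types, wanted):
    """Count rows where Type equals `wanted` AND the row is a first occurrence."""
    flags = first_occurrence_flags(customers, types)
    return sum(
        1
        for c, t, first in zip(customers, types, flags)
        if t == wanted and first and c is not None  # DCOUNTA counts non-blank cells
    )


print(dcounta_like(CUSTOMERS, TYPES, "TOASTERS"))  # 3
```

The second toaster row for company 4 gets FALSE, so only three rows meet both criteria. The expanding range is quadratic in the row count, which is worth keeping in mind at 10,000 rows.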
    Attached Files
