+ Reply to Thread
Results 1 to 4 of 4

Estimating total usage incidence from samples

  1. #1
    Registered User
    Join Date
    01-16-2013
    Location
    Michigan, U.S.
    MS-Off Ver
    Excel 2007
    Posts
    2

    Question Estimating total usage incidence from samples

    I’m trying to use Excel to estimate total unique visitors and total 1, 2 and 3+ time users from three (non-exclusive) samples of varying size and %s. Then, to estimate the same values from all three samples combined – an example follows:

    Group #1:
    Total count: 300,000
    Sample available: 30,000
    Sample Size % of Total: 10.0%

    Similarly, groups 2 & 3 lay out the same way:
    Group 2:
    5,000 (total)
    2,000 (sample)
    40.0% (sample %)

    Group 3:
    40,000
    10,000
    25.0%

    Total
    345,000
    42,000
    12.2%

    Next, I use individual names to count duplicates within each group to get numbers like the following:

    Group 1:
    Uniques: 24,000
    Single Use Sample: 20,000
    Dual Use Sample: 3,000
    3+ Use Sample: 1,000

    …using the same layout for other groups:
    Group 2:
    1,150 (uniques)
    1,000 (1x)
    100 (2x)
    50 (3+x)

    Group 3:
    7,100
    6,000
    1,000
    100

    Adding across the groups, provides an incorrect “additive total” as follows:

    “Additive Total”:
    32,250
    27,000
    4,100
    1,150

    If I actually combine the raw, individual name data from all three groups, I might get true, deduped total numbers like:
    30,900
    25,000
    4,400
    1,500

    You’ll notice that the increased incidence of multi-use persons (versus the “additive total”) when combining the three different groups as some of the same users exist across the different groups. This, in turn, reduces the one-time use people and uniques relative to the “additive total” figures.

    I point this out because the next step, and ultimate objective, is to estimate the uniques and single, dual and 3+ use for each of the groups individually and in total: Here again, I can’t use straight-line math to get to a total as when the quantities of users increase, the uniques and single-use persons will be reduced versus a linearly-estimated total.

    As an example: I can’t take 24,000 uniques in the Group 1 sample and divide by the 10% sample size to get an estimated 240,000 total uniques as many of the sampled uniques would show up again if we actually had the full roster of 300,000 Group 1 people; the real uniques value, from the full data set, might be 180,000 or 210,000.

    Similarly, when combining the samples across groups, I can’t apply the 12% factor to the combined total sample uniques of 30,900 to get an estimated total of 253,821 as the multi-use counts would increase, decreasing the uniques and one-time use individuals.

    So that was a long, but (hopefully) illustrative, way of getting to the question of “What’s an Excel formula that can be applied to the individual sample groups, and to the combined sample totals, to estimate the full data set values?"

    It's worth noting that sample sizes might vary from 10-100% in the groups and across different sets of other groups of users that I need to compare to. The bottom line is I need estimates for this set of groups, and several other varied sets of groups so I can compare the absolute, estimated uniques, 1, 2 and 3+ numbers across the sets.

    Thanks for your help!

  2. #2
    Forum Expert dilipandey's Avatar
    Join Date
    12-05-2011
    Location
    Dubai, UAE
    MS-Off Ver
    1997 - 2016
    Posts
    8,191

    Re: Estimating total usage incidence from samples

    Hi Saga,

    welcome to the forum.

    Very detailed description.. but no response from forum members.. would suggest you to include a sample excel workbook with your expected results. thanks.


    Regards,
    DILIPandey
    <click on below * if this helps>
    DILIPandey, Excel rMVP
    +919810929744 (India), +971528225509 (Dubai), dilipandey@gmail.com

  3. #3
    Registered User
    Join Date
    01-16-2013
    Location
    Michigan, U.S.
    MS-Off Ver
    Excel 2007
    Posts
    2

    Re: Estimating total usage incidence from samples

    Estimating usage math puzzle.xlsx

    Attached are the summary numbers in a file form so you can see the build / relationships.

  4. #4
    Forum Expert dilipandey's Avatar
    Join Date
    12-05-2011
    Location
    Dubai, UAE
    MS-Off Ver
    1997 - 2016
    Posts
    8,191

    Re: Estimating total usage incidence from samples

    Hi Saga, I believe you need results in yellow cells (G16:G19).. and the results which are there, are not correct. Can you explain what number you wish to see there - may be just one of them. Thanks.



    Regards,
    DILIPandey
    <click on below * if this helps>

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1