+ Reply to Thread
Results 1 to 16 of 16

Sampling Without Repetition

  1. #1
    Registered User
    Join Date
    09-10-2010
    Location
    USA
    MS-Off Ver
    Excel 2007
    Posts
    9

    Sampling Without Repetition

    Hello,
    I have a set of data which includes 10 samples each having 775 possible data points. I want to see what is shared between these samples when I go down to 9, 8, 7, 6... n samples. I am hoping that I can do this with Visual Basic, or other Excel functions where I could essentially determine the average of this similarity after calculating what is shared between all possible combinations given the sample total I am on. For example if I want to see what is shared across 9 out of the 10 samples, then I would need to determine the average of this shared-ness between all possible combinations of 9. I would upload a dataset but they are too large. Let's say in A175 I have an ID and in columns A2 through A11 are the values corresponding to that ID for each sample, I want to take all possible combinations of n samples for this value. I apologize if this doesn't provide enough detail or is vague, I am relatively new to this type of analysis and am trying to teach myself in a limited timeframe. Thank you in advance.

    -Provoxt

  2. #2
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    re: Sampling Without Repetition

    Welcome to the forum.

    That is indeed pretty vague.

    Whatever it is you're trying to do, it can be done with Excel, unless you're talking about a very large number of combinations to evaluate.
    Last edited by shg; 09-10-2010 at 12:02 PM.
    Entia non sunt multiplicanda sine necessitate

  3. #3
    Registered User
    Join Date
    09-10-2010
    Location
    USA
    MS-Off Ver
    Excel 2007
    Posts
    9

    Re: Sampling Without Repetition

    My apologies for the vagueness.

    I have a 775 x 11 table basically. Each row consists of an ID followed by specific values for each column header (sample). I am trying to determine how well values are shared between all of these samples. The way I have to do that is by looking at what is shared between all 10 samples, which I can easily do with =Countif(), and seeing what all comes up with a count of 10 in each row. It gets harder to do as I start getting rid of 1 sample to only look at 9, in which case I have to look at every combination of 9 samples (1-2-3-4-5-6-7-8-9/1-2-3-4-5-6-7-8-10, etc.) and then continue to do this with combinations of 8 samples, 7 samples, and so on. I have used Visual Basic before, but it was EXTREMELY introductory. I want to try and write a script that would assess all of the possible combinations without repeating any, total up each row and then either take the average, or output each total so I can average them myself. I hope this helps, again please accept my apologies for lack of knowledge/understanding.

    -Provoxt

  4. #4
    Registered User
    Join Date
    09-10-2010
    Location
    USA
    MS-Off Ver
    Excel 2007
    Posts
    9

    Re: Sampling Without Repetition

    Maybe this will help.

    X'ed cells indicate those to be included
    Blacked cells indicate exclusion in combination

    Sum would be sum for that ID at each combination

    Average would hopefully be the average of all Sums
    Attached Files Attached Files

  5. #5
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Sampling Without Repetition

    =SUM(B2:K11) - SUMPRODUCT( --(ROW(B2:K11) - ROW(B2) = COLUMN(K11) - COLUMN(B2:K11)), B2:K11)

    ... returns the sum less the diagonal.

    If you want help with Excel, please psot workbooks, not PDFs or pictures or ...

  6. #6
    Registered User
    Join Date
    09-10-2010
    Location
    USA
    MS-Off Ver
    Excel 2007
    Posts
    9

    Re: Sampling Without Repetition

    Again I apologize.

    I have attached an excel sheet this time. I think I need a script to create each possible combination, since the number of them will increase as the sampling size decreases (with 8 as a sampling size the number of combinations becomes 45). I really appreciate the help with this, I hope it's becoming clearer what I am trying/needing to do. Please keep in mind that the attached sheet is only a tiny example of the overall actual table.
    Attached Files Attached Files

  7. #7
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Sampling Without Repetition

    I don't get it. Show some examples.

  8. #8
    Registered User
    Join Date
    09-10-2010
    Location
    USA
    MS-Off Ver
    Excel 2007
    Posts
    9

    Re: Sampling Without Repetition

    What I have done is illustrate what happens with a few different IDs. As you can see, in the top rows A occurs 10 times only once and so would be considered a shared value among these samples, but only for that one ID. When I look at 9 samples ID 2 becomes shared over the total number of samples assessed, however when we look at the actual sample combinations we see that it's only shared in one combination. This is what I am looking to write a script for, but instead of my manually arranging each sample to create a combination, I want to automate that, especially given that as I decrease the number of samples to assess the possible combinations increases quite a bit. What I don't know how to do is write the script to pick each of these combinations without repeating. So if there are possible combinations of S1-S10, I only want to look at each combination once. Hope that makes it clearer
    Attached Files Attached Files

  9. #9
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Sampling Without Repetition

    See attached.
    Attached Files Attached Files

  10. #10
    Registered User
    Join Date
    09-10-2010
    Location
    USA
    MS-Off Ver
    Excel 2007
    Posts
    9

    Smile Re: Sampling Without Repetition

    I am really intrigued by your reply, could you explain how the functions actually work? Are there any good sources out there for understanding Excel functions?

  11. #11
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Sampling Without Repetition

    I only added one equation to the workbook in C21,, and then copied it:

    =INDEX($C$2:$L$6, $A21, COLUMNS($C21:C21) + (COLUMNS($A21:C21) > 10 - $B21 + 2) )

    $C$2:$L$6 are the table extents

    $A21 (ID) chooses the row from the table

    COLUMNS($C21:C21) chooses the same column from the table that the formula is in

    + (COLUMNS($A21:C21) > 10 - $B21 + 2) adds one to the column if it's on or after the excluded diagonal.

  12. #12
    Registered User
    Join Date
    09-10-2010
    Location
    USA
    MS-Off Ver
    Excel 2007
    Posts
    9

    Re: Sampling Without Repetition

    I follow everything but the last line... and how do I take into account the exclusion of more cells as I decrease the sampling group?

  13. #13
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Sampling Without Repetition

    I don't follow that.

  14. #14
    Registered User
    Join Date
    09-10-2010
    Location
    USA
    MS-Off Ver
    Excel 2007
    Posts
    9

    Re: Sampling Without Repetition

    As I decrease the sampling groups from 10 to 9 to 8 etc. how do I indicate that? Because that will lead to an increase in excluded cells and result in more combinations. Unless I am not understanding this enough and have overlooked what to do..

  15. #15
    Forum Expert shg's Avatar
    Join Date
    06-20-2007
    Location
    The Great State of Texas
    MS-Off Ver
    2010, 2019
    Posts
    40,689

    Re: Sampling Without Repetition

    Provoxt, you have a clear picture of what you're trying to do; I have none at all, including how any of this relates to sampling.

  16. #16
    Registered User
    Join Date
    09-10-2010
    Location
    USA
    MS-Off Ver
    Excel 2007
    Posts
    9

    Re: Sampling Without Repetition

    I am looking at a dataset taken from 10 samples. I have noticed that there is a set of tested IDs that have the same value in all 10 of my samples, and I want to see how this similarity changes by decreasing the number of samples, but in order to accurately do this I have to look at all possible combination of samples with respect to the total number of samples at each level. I believe it's similar to rarefaction, but I am not completely sure. What you have given me is potentially everything I need, but I don't understand what each component in the third line of your function explanation is responsible for (i appreciate the overall explanation), and thus not sure how I would need to edit this in order to take into account a lower overall group and more sample combinations.

    For instance, when exclude two samples at a time, the total possible combination of samples will increase alot, but I don't know how to account for that, so I can accurately determine total occurences of this value. I really do apologize if I am not giving enough detail, I appreciate the help you have given me.
    Last edited by Provoxt; 09-13-2010 at 10:23 AM. Reason: Typo

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1