+ Reply to Thread
Results 1 to 15 of 15

Return value per record (where >1 row per record) based on criteria for multiple columns

  1. #1
    Registered User
    Join Date
    04-14-2013
    Location
    Detroit, MI
    MS-Off Ver
    Excel 2007
    Posts
    10

    Return value per record (where >1 row per record) based on criteria for multiple columns

    Hello.

    I have a data set that looks roughly like this (notice the duplicate records per student, sometimes there are 3+ per student):

    Student Score Date
    1 34 31-Jan
    1 54 26-Jan
    2 65 26-Jan
    2 76 31-Jan
    3 45 3-Feb
    3 67 31-Jan

    I am looking for an automated way (macro or formula) to return the score per student based on the following criteria:
    1. highest score
    and when the scores are the same for each of an individual student's records:
    2. most recent date
    and when the scores and dates are both identical for the same student's records, a 3rd criteria and so on.

    Thanks in advance!!!

  2. #2
    Forum Guru :) Sixthsense :)'s Avatar
    Join Date
    01-01-2012
    Location
    India>Tamilnadu>Chennai
    MS-Off Ver
    2003 To 2010
    Posts
    12,789

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Please attach a sample workbook with expected output for better understanding


    If your problem is solved, then please mark the thread as SOLVED>>Above your first post>>Thread Tools>>
    Mark your thread as Solved


    If the suggestion helps you, then Click *below to Add Reputation

  3. #3
    Registered User
    Join Date
    04-14-2013
    Location
    Detroit, MI
    MS-Off Ver
    Excel 2007
    Posts
    10

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Sorry, attached is a sample input and output with notes for each table.
    The overall logic is for the Excel to look at the scores and take the score for each student unless there is more than one score for a student.
    In that case, take the highest score for that student unless there is more than one of the same highest score for a student.
    In that case, take the most recent of those scores unless they are both on the same day.
    In that case, take the score on that day with the lowest error.

    Thanks in advance!

    Quote Originally Posted by :) Sixthsense :) View Post
    Please attach a sample workbook with expected output for better understanding
    Attached Files Attached Files

  4. #4
    Valued Forum Contributor Lemice's Avatar
    Join Date
    04-13-2013
    Location
    Somewhere.
    MS-Off Ver
    Excel 2016
    Posts
    696

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Hello,

    If you have already sorted, Excel should sort it with priority, like from student 1 (First Name, Last Name), then it will sort Score, Date, then Error.
    On your Sample, all of your choice are the top row of that student's results, so you can make an easy Array formula like this
    =LARGE(IF($A$3:$A$11=$A15,$B$3:$B$11),1)
    When you enter it, hold Ctrl-Shift then hit Enter, because it's an Array formula

    Hope this help.
    (copy pasta from Ford)
    1. Use code tags for VBA. [code] Your Code [/code] (or use the # button)
    2. If your question is resolved, mark it SOLVED using the thread tools

    Regards,
    Lem

  5. #5
    Registered User
    Join Date
    04-14-2013
    Location
    Detroit, MI
    MS-Off Ver
    Excel 2007
    Posts
    10

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Definitely helpful! I was not aware of the large function. I'll see if I can somehow use it with others to do all in one (instead of having to sort first).


    Thanks!

  6. #6
    Registered User
    Join Date
    04-14-2013
    Location
    Detroit, MI
    MS-Off Ver
    Excel 2007
    Posts
    10

    The Plot Thickens: Pull from duplicate records based on multi-column criteria

    Question2.xlsx
    Okay, this is a bit more complex than I thought but I think I have the logic worked out now.
    Given the attached input and output tables, I need a formula that selects a score per student based on the following:

    To select a score:
    if for any student:
    1. there exists only one row (score), then use that one.
    2. Otherwise, if there is >1 row for a student, then
    3. take the score from the row with the "largest" date value (most recent date)
    4. but if the "largest" date value has >1 occurrence for that student, then
    5. take the score from the row with the smallest value of error
    6. but if the smallest value of error has >1 occurrence for that student then
    7. take the score from the row with the highest score

    Thanks!

  7. #7
    Valued Forum Contributor Lemice's Avatar
    Join Date
    04-13-2013
    Location
    Somewhere.
    MS-Off Ver
    Excel 2016
    Posts
    696

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    I have a question, it has been bugging me the whole time I'm trying to do this.
    If the student no.1 for example, scored 93 with 4 errors and 95 with 6 errors, both on 4/29/2013, then basing on your priority list, the score of that student will be 93 instead of 95, because both are most recent, and the 93 one has smallest number of errors?

  8. #8
    Valued Forum Contributor Lemice's Avatar
    Join Date
    04-13-2013
    Location
    Somewhere.
    MS-Off Ver
    Excel 2016
    Posts
    696

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    If nothing changed, and if student no.1 still get the score 93 in the question above, this might be the formula you are looking for
    Please Login or Register  to view this content.
    If you want to use it, hold Ctrl-Shift then hit Enter after you edit / put it in a cell, because it is an array formula.
    And here is the sample file - Question2.xlsx
    it took me literally 1 hour to figure this out, I feel like learning how to do math all over again ...
    If you want to discuss about how I make this up, I'll gladly share it over.

    Hope this help.

    P/s: You might find some cell with different value than your original file, it's because I changed them intentionally to test out every possibility.
    Last edited by Lemice; 04-16-2013 at 02:01 AM.

  9. #9
    Registered User
    Join Date
    04-14-2013
    Location
    Detroit, MI
    MS-Off Ver
    Excel 2007
    Posts
    10

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Yes, that is correct. By "error," I'm referring to the accuracy of the test score (standard error of measurement).

  10. #10
    Registered User
    Join Date
    04-14-2013
    Location
    Detroit, MI
    MS-Off Ver
    Excel 2007
    Posts
    10

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Wow, thanks!

    Yes, I'd like to understand the logic behind it so that I can modify in the future if needed (please).

    I think I might want to nest this formula in another that refers to the whole table to take the average scaled score based on those criteria.
    Something like =average(the formula you provided without referring to individual rows). Does that make sense?
    To put it differently, the average of all scores that meet the formula criteria.

    Thanks again!

  11. #11
    Valued Forum Contributor Lemice's Avatar
    Join Date
    04-13-2013
    Location
    Somewhere.
    MS-Off Ver
    Excel 2016
    Posts
    696

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Hello,

    This look-complicated formula is actually a series of LARGE and SMALL function. I combined 3 function together, and you can say they search within other's results, like how you eliminating scores using your priority.
    The core of the function, aka the first one to calculate before anything else is your rule number 3
    Please Login or Register  to view this content.
    What this does is identify the most recent date accordingly to the number / name of student in Cell A15
    And I notice something, in the date column, even this most recent date is not absolute - it might not be the most recent date for others, or it is, leading to duplicated results. However, there is something that will make it unique - I add the number / name of student into it, making it a combined string, and it WILL become unique. Think of "April 29", there are lots of rows with that, but "April 29 OF Student 1", only a handful, isn't it?
    So, my idea here is to combine the Array of result (there might be more than 1 test with most recent date) and the unique id of student (name, number, etc) and create a new Array to look for, hence
    Please Login or Register  to view this content.
    If you break it down, it actually means this
    =SMALL(IF("List of ID"&"List of Date"="ID student no.1"&"Most recent Test Date for no.1",1),"Results if match"),1)
    This serves as the function to find the smallest error among all most recent date of a student.
    And again, with the very same idea, there can be more than one result, right? Thus, I added the final touch for it, and here we go
    Please Login or Register  to view this content.
    It actually means this
    =LARGE(IF("List of ID"&"List of Errors"="Student no.1"&"Smallest Errors in Most Recent date"),"Results"),1)
    This function is the final to calculate, and basing on all of the criteria above, it will find the largest score that fits in.
    For your question, I think you can do it, but I suggest making an independent formula to calculate it though, because dealing with Array isn't always ideal.

    Hope this help.
    Last edited by Lemice; 04-16-2013 at 03:44 PM. Reason: forgot / in [code]

  12. #12
    Registered User
    Join Date
    04-14-2013
    Location
    Detroit, MI
    MS-Off Ver
    Excel 2007
    Posts
    10

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Is there any way to avoid using an array formula (it takes forever to computer with thousands of cells) and to use the following order for criteria instead?
    1. lowest SEM (error)
    2. Highest Score
    3. Most Recent Date

    Thanks again,
    Michael

  13. #13
    Valued Forum Contributor Lemice's Avatar
    Join Date
    04-13-2013
    Location
    Somewhere.
    MS-Off Ver
    Excel 2016
    Posts
    696

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Eventually you will want to use Array formula because you are finding a value base on a list of value in random order.

    However, there are things that can make your table run faster:
    - Give the formula Specific Cell Reference, for example A3:A11 is MUCH faster than A:A. This can be done through INDIRECT pointing to a MATCH to find the last row with active cell.
    - Split the formula into smaller one with helper columns.

    If you can provide a sample with about 1000 dummy data (You can make use of RANDBETWEEN here and there), and your condition of how many helper column maximum, etc I will see what I can do. Other than that, you might want to look out for a macro.

  14. #14
    Registered User
    Join Date
    04-14-2013
    Location
    Detroit, MI
    MS-Off Ver
    Excel 2007
    Posts
    10

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    Hello again,

    I just used this formula for the first time in months and it saved me hours of eye strain!
    I have yet another request, however: Is it possible to also return the row number of the value used?
    If not, perhaps the standard error and date associated with the chosen score?

    Forever grateful,
    Michael

  15. #15
    Registered User
    Join Date
    04-14-2013
    Location
    Detroit, MI
    MS-Off Ver
    Excel 2007
    Posts
    10

    Re: Return value per record (where >1 row per record) based on criteria for multiple colum

    New Question:
    I want to return the scaled score when there are duplicates. The criteria for which one to return is as follows:
    1. lowest standard error
    2. most recent
    3. highest score

    I've modified the formula from the previous response:
    =LARGE(IF($A$3:$A$11&$C$3:$C$11=$A15&SMALL(IF($A$3:$A$11&$B$3:$B$11=$A15&LARGE(IF($A$3:$A$11=$A15,$B$3:$B$11),1),$C$3:$C$11),1),$D$3:$D$11),1)
    such that all references to date and error have been switched.

    It appears to work, save for the orange highlighted instances in the attached worksheet.
    Can anyone tell me why these exceptions (in the worksheet) are occurring?

    Thanks,
    Michael
    Attached Files Attached Files

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1