+ Reply to Thread
Results 1 to 12 of 12

Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

Hybrid View

  1. #1
    Registered User
    Join Date
    04-21-2020
    Location
    Australia
    MS-Off Ver
    2010
    Posts
    5

    Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    Hi guys,

    Here is my problem that I can't seem to find a solution to:

    Suppose you have an ordered, indexed list of positive values. These positive values are interrupted by 0 values. You want to determine if a consecutive sub-array exists which is not interrupted by 0 values and whose sum exceeds a certain threshold.

    Simple example:
    Index, Value
    0   0
    1   0
    2   3
    3   4
    4   2
    5   6
    6   0
    7   0
    8   0
    9   2
    10  3
    11  0
    In the above example, the largest consecutive sub-array not interrupted by 0 is from index 2 to index 5 inclusive, and the sum of this sub-array is 15.

    Thus, for the following thresholds 20, 10 and 4, the results should be FALSE, TRUE and TRUE respectively.

    Note I don't necessarily have to find the largest sub-array, I only have to know if any uninterrupted sub-array sum exceeds the defined threshold.

    I suspect this problem is a variation of Kadane's algorithm, but I can't quite figure out how to adjust it.

    The added complication is that I have to perform this analysis in Excel or Google Sheets, and I cannot use scripts to do it - only inbuilt formulas.

    I'm not sure if this can even be done, but I would be grateful for any input.

    PS: I have found the topic "Maximum Sub-array Problem... Kadane's algorithm?" (sorry, not allowed to post links yet), but still could not work out how to adjust it to my problem.

  2. #2
    Forum Expert XLent's Avatar
    Join Date
    10-13-2010
    Location
    Northumberland, UK
    MS-Off Ver
    various
    Posts
    2,706

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    I can see smarter folk than me looking at this but the below would be one way to isolate the discrete sums of the 1 to n arrays, against which you could apply threshold logic.

    Formula: copy to clipboard
    cell of choice
    =SUBTOTAL(9,OFFSET($B$1,AGGREGATE(15,6,ROW($B$2:$B$13)/(($B$2:$B$13<>0)*(N(+$B$1:$B$12)=0))-1,ROW(A$1:INDEX($A:$A,SUMPRODUCT(($B$2:$B$13<>0)*($B$1:$B$12=0))))),0,1+AGGREGATE(15,6,ROW($B$1:$B$12)/(($B$2:$B$13=0)*(N(+$B$1:$B$12)<>0)),ROW(A$1:INDEX($A:$A,SUMPRODUCT(($B$2:$B$13<>0)*($B$1:$B$12=0)))))-AGGREGATE(15,6,ROW($B$2:$B$13)/(($B$2:$B$13<>0)*(N(+$B$1:$B$12)=0)),ROW(A$1:INDEX($A:$A,SUMPRODUCT(($B$2:$B$13<>0)*($B$1:$B$12=0))))),1))

    so, applying an outer MAX to the above, coupled with a final >= test would give you the boolean response [would need to be confirmed with CTRL + SHIFT + ENTER]

    edit: I can't vouch for Google Sheets feasibility as I don't use it, I'm afraid.
    Attached Files Attached Files
    Last edited by XLent; 04-21-2020 at 07:52 AM. Reason: added CSE point re: MAX etc.

  3. #3
    Forum Expert XOR LX's Avatar
    Join Date
    04-18-2013
    Location
    Turin, Italy
    MS-Off Ver
    Office 365
    Posts
    7,742

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    Not particularly happy about the double-OFFSET, but as a first attempt, based on data in A1:B13 (with headers in row 1), array formula**:

    =MAX(SUBTOTAL(9,OFFSET(OFFSET(B2,A2:A14,),,,-FREQUENCY(IF(B2:B13,A2:A13),IF(B3:B14=0,A2:A13,0))-1)))

    Note the deliberately offset ranges (A2:A13, A2:A14, B2:B13, B3:B14).

    Regards


    **Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
    Click * below if this answer helped

    Advanced Excel Techniques: http://excelxor.com/

  4. #4
    Forum Expert XOR LX's Avatar
    Join Date
    04-18-2013
    Location
    Turin, Italy
    MS-Off Ver
    Office 365
    Posts
    7,742

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    Edit: don't need the double-OFFSET after all!

    =MAX(SUBTOTAL(9,OFFSET(B2,A2:A14,,-FREQUENCY(IF(B2:B13,A2:A13),IF(B3:B14=0,A2:A13,0))-1)))

    Regards

  5. #5
    Registered User
    Join Date
    04-21-2020
    Location
    Australia
    MS-Off Ver
    2010
    Posts
    5

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    Wow - you guys are blowing my mind. I'm definitely no novice in Excel, but there is no way I would have come up with these solutions!!! Thank you so much! I dove deep into XOX LX's solution and it took me about an hour to wrap my head around it...

    One question though: Couldn't I just use sum instead of subtotal? I've tested in and it seems to work?

  6. #6
    Forum Expert XOR LX's Avatar
    Join Date
    04-18-2013
    Location
    Turin, Italy
    MS-Off Ver
    Office 365
    Posts
    7,742

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    With my formula you cannot replace SUBTOTAL with SUM, no. You could have a particular set of values for which both versions give identical results, I suppose, though that would be pure coincidence.

    For the original data you posted, for example:

    =MAX(SUBTOTAL(9,OFFSET(B2,A2:A14,,-FREQUENCY(IF(B2:B13,A2:A13),IF(B3:B14=0,A2:A13,0))-1)))

    returns 15, whereas:

    =MAX(SUM(OFFSET(B2,A2:A14,,-FREQUENCY(IF(B2:B13,A2:A13),IF(B3:B14=0,A2:A13,0))-1)))

    returns 11.

    Unlike SUM, SUBTOTAL can return an array of sums, which can then be passed to a further function, e.g. MAX.

    Regards

  7. #7
    Registered User
    Join Date
    04-21-2020
    Location
    Australia
    MS-Off Ver
    2010
    Posts
    5

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    Quote Originally Posted by XOR LX View Post
    With my formula you cannot replace SUBTOTAL with SUM, no. You could have a particular set of values for which both versions give identical results, I suppose, though that would be pure coincidence.

    For the original data you posted, for example:

    =MAX(SUBTOTAL(9,OFFSET(B2,A2:A14,,-FREQUENCY(IF(B2:B13,A2:A13),IF(B3:B14=0,A2:A13,0))-1)))

    returns 15, whereas:

    =MAX(SUM(OFFSET(B2,A2:A14,,-FREQUENCY(IF(B2:B13,A2:A13),IF(B3:B14=0,A2:A13,0))-1)))

    returns 11.

    Unlike SUM, SUBTOTAL can return an array of sums, which can then be passed to a further function, e.g. MAX.

    Regards
    You are correct of course. I had manually checked it when rebuilding the formula, but stopped short before applying the MAX function. Makes sense, thank you.

    One complication I just came across is that the formula does not work in Google Sheets. I think this is because the FREQUENCY function works slightly differently in Sheets. Specifically, it sorts the bins by value, i.e. it puts all the 0 values at the beginning of the returned array.

    From Google Sheets Help:

    Syntax
    FREQUENCY(data, classes)

    data - The array or range containing the values to be counted.

    classes - The array or range containing the set of classes.

    classes should be sorted for clarity, but FREQUENCY will sort the values specified internally if they are not and return correct results.
    I'm thinking I may have to rebuild the frequency function from scratch without sorting, e.g. using countifs - or is there a simpler way to do this?

  8. #8
    Forum Expert XOR LX's Avatar
    Join Date
    04-18-2013
    Location
    Turin, Italy
    MS-Off Ver
    Office 365
    Posts
    7,742

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    Problematic!

    I'll have a look and see if I can find an alternative construction that works in Google Sheets.

    I have to say that Sheets' lack of an equivalent to the Excel Evaluate Formula tool is a huge oversight, especially when working with complex constructions such as this one.

    Regards

  9. #9
    Forum Expert XOR LX's Avatar
    Join Date
    04-18-2013
    Location
    Turin, Italy
    MS-Off Ver
    Office 365
    Posts
    7,742

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    The issue appears to lie not only with the FREQUENCY function but also with how OFFSET functions within Sheets.

    If FREQUENCY were the only issue, hard-coding the correct output from that function should theoretically work. However:

    =ArrayFormula(MAX(SUBTOTAL(9,OFFSET(B2,A2:A14,,{-1;-1;-1;-1;-1;-5;-1;-1;-1;-1;-3;-1;-1}))))

    returns #VALUE!.

    I believe I've come across this issue before, though don't recall finding a satisfactory explanation.

    Regards

  10. #10
    Registered User
    Join Date
    04-21-2020
    Location
    Australia
    MS-Off Ver
    2010
    Posts
    5

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    Quote Originally Posted by XOR LX View Post
    The issue appears to lie not only with the FREQUENCY function but also with how OFFSET functions within Sheets.

    If FREQUENCY were the only issue, hard-coding the correct output from that function should theoretically work. However:

    =ArrayFormula(MAX(SUBTOTAL(9,OFFSET(B2,A2:A14,,{-1;-1;-1;-1;-1;-5;-1;-1;-1;-1;-3;-1;-1}))))

    returns #VALUE!.

    I believe I've come across this issue before, though don't recall finding a satisfactory explanation.

    Regards
    Yes, you are correct! I will look into this a bit more to see if I can understand why the Offset function doesn't work as expected.

  11. #11
    Forum Expert XOR LX's Avatar
    Join Date
    04-18-2013
    Location
    Turin, Italy
    MS-Off Ver
    Office 365
    Posts
    7,742

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    Ah, ok. It's because, in Sheets, OFFSET's height parameter must be positive.

    In fact, it's supposed to be in Excel as well!

    "Height Optional. The height, in number of rows, that you want the returned reference to be. Height must be a positive number."

    (From https://support.office.com/en-us/art...e-b4d906d11b66.)

    It appears that a long-standing 'bug' allows negative values for this parameter in Excel:

    https://answers.microsoft.com/en-us/...b-68b599b31bf5

    First I knew! I've used this property on several occasions.

    Obviously Google fixed this bug in their equivalent.

    Will continue to look for an alternative 'non-buggy' solution!

    Cheers

  12. #12
    Forum Expert XOR LX's Avatar
    Join Date
    04-18-2013
    Location
    Turin, Italy
    MS-Off Ver
    Office 365
    Posts
    7,742

    Re: Finding uninterrupted sub-arrays in Excel - Kadane's algorithm variation?

    Hi,

    I've managed to come up with another Excel-based solution which this time does not use FREQUENCY. However, it does use OFFSET and, again, it fails in Sheets due to the apparent difference in functionality of that function across the two platforms.

    I've posted a thread re OFFSET in Google Sheets here:

    https://www.excelforum.com/for-other...arameters.html

    I'm afraid that, unless someone can confirm that it is in fact possible to pass arrays to OFFSET's parameters within Google Sheets, I doubt that I will be able to provide you with a solution which works there.

    Regards

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Replies: 3
    Last Post: 09-26-2018, 07:34 AM
  2. Maximum Sub-array Problem... Kadane's algorithm?
    By cpinault2008 in forum Excel Formulas & Functions
    Replies: 7
    Last Post: 01-13-2015, 05:10 PM
  3. Replies: 5
    Last Post: 08-06-2014, 04:19 AM
  4. [SOLVED] Finding standard error of variation when z-score and population are known?
    By rowanbarnes in forum Excel Formulas & Functions
    Replies: 6
    Last Post: 10-17-2013, 05:52 AM
  5. finding uninterrupted group of cells containg data - like autoSUM
    By flatflo in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 11-28-2012, 05:34 AM
  6. Finding Duplicates Variation
    By Vilac7 in forum Excel General
    Replies: 2
    Last Post: 11-18-2012, 01:10 PM
  7. Finding the last value that has a variation of n%
    By mattkatt in forum Excel General
    Replies: 1
    Last Post: 03-05-2010, 05:24 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1