+ Reply to Thread
Results 1 to 6 of 6

Speed up a 100k AverageIf formulas

  1. #1
    Forum Contributor
    Join Date
    10-11-2008
    Location
    Portugal
    MS-Off Ver
    Excel 2010
    Posts
    111

    Speed up a 100k AverageIf formulas

    Hi,

    I have this file in which I want to use the averageifs formula to process some raw data (sheets 1 and 2).

    The thing is, with over 100k calculations to do, it takes literally AGES to process all the data (somewhere between 1h-2h), and I need a way to speed things up, maybe through VBA. Any ideas on how to accomplish this? I read somewhere that using arrays in VBA is a much quicker process than using only Excel, but I don't know if this is the case.

    Can you help?

    Thanks

    PS: I've attached a sample file with 200 rows, in which I created a macro to count the time it takes to make all those calculations. On my PC it takes little over than 2 seconds. The problem is that having 9000 rows doesn't take 2 * 9000 / 200 seconds...
    Attached Files Attached Files
    Last edited by Coaster; 08-03-2010 at 09:25 PM.

  2. #2
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Is it possible to speed up a +100k AverageIfs formulas?

    The first quick-win in these scenarios is to remove the requirement for the multi conditional test in the first instance.

    Using your sample file - if you concatenate the fields of interest to a "key" on Sheet 1 then you can revert to AVERAGEIF using a single test based on the key.

    Please Login or Register  to view this content.
    You can then revert your formulae to:

    Please Login or Register  to view this content.
    In terms of performance - on my machine your original matrix took around 4.5 seconds to execute whereas utilising the above the calc time drops to around 0.6 seconds.

    If the performance gains remaining insufficient then the next thing to do would be to look into sorting your data on Sheet 1 (appropriately).
    Sorting permits use of Binary Search based calculations which on large sets invariably offer some efficiencies and also has other advantages too regards helper calcs to reduce burden.
    Last edited by DonkeyOte; 07-30-2010 at 02:11 AM. Reason: typo in narrative

  3. #3
    Forum Contributor
    Join Date
    10-11-2008
    Location
    Portugal
    MS-Off Ver
    Excel 2010
    Posts
    111

    Re: Is it possible to speed up a +100k AverageIfs formulas?

    thank you for your reply,

    i've done some modifications to the file: included a progress bar (on the status bar) and used your suggestion on the averageif formula.

    it's true: calculation takes less than 1 sec (0.8 on my pc).

    however, with the original file, which has 9000 rows to calculate on sheet 3, it takes 14 min

    this is still a lot and I would like to reduce this...could you explain your remark on data sorting a little further? I actually tried sorting the data in sheet 1 to see if it would take less time, but it didn't...are you talking about something else?

    again, thank you!
    Attached Files Attached Files

  4. #4
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Is it possible to speed up a +100k AverageIfs formulas?

    With the data sorted you can dispense with the need for the AVERAGE if altogether - if you're prepared to use additional helper calcs.

    It's often the case that a greater number of lightweight calcs will perform far better than fewer more elegant yet expensive formulae.

    Illustrating via your latest sample

    Please Login or Register  to view this content.
    what this does is essentially conduct the average on the sheet itself but only looking at a small subset of the data (once per Joint@OutputCase combination) and using binary search to identify the ranges... this means these calculations are incredibly fast.

    With the above in place all you need do now is modify the formula on your summary sheet to utilise these values and again use Binary Search based Lookup to retrieve the appropriate value.

    Please Login or Register  to view this content.
    The above performs two binary search based lookups - the first retrieves the last average value listed where the key code is <= $A4&"@"&$F$2.
    It follows that given LOOKUP is approximate match test the resulting value may or may not be for this specific key so we multiply the result by whether or not the key matches our criteria (0 result if not).

    It's very important to note that this type of LOOKUP is blindingly fast so even though we're conducting two LOOKUPs they will be significantly quicker than the single AVERAGEIF calc being performed previously.

    With the above approach implemented in your file the calculation time for the matrix dropped from 0.6 seconds to between 0.02 and 0.04 seconds ... ie very very fast and a big % gain.

    The key is all calculations in place are now very light weight and as such will perform very efficiently - even though there are more of them.
    Last edited by DonkeyOte; 08-02-2010 at 12:16 PM. Reason: typo in narrative

  5. #5
    Forum Contributor
    Join Date
    10-11-2008
    Location
    Portugal
    MS-Off Ver
    Excel 2010
    Posts
    111

    Re: Is it possible to speed up a +100k AverageIfs formulas?

    DEAR GOD! this is some piece of magic coding right there! it's now LIGHTNING fast!

    haven't quite figured out what you did there, i'll study the formulas better but lord....you turned a process that took 14min down to 3 seconds!!

    i cannot find the right words to appreciate your work! THANK YOU!

    i'll save hours if not days with your suggestion!

    i bow to you!

  6. #6
    Forum Guru DonkeyOte's Avatar
    Join Date
    10-22-2008
    Location
    Northumberland, UK
    MS-Off Ver
    O365
    Posts
    21,531

    Re: Speed up a 100k AverageIf formulas

    Great news.

    There are a few examples of these types of gains on this board and in general all follow the same pattern that is to say:

    -- sorting data (allowing for binary search methods)
    -- use of basic & efficient helpers
    -- avoiding multi conditional summary calculations en masse or in conjunction with large data sets (avoiding Volatility)

    Charles Williams' site is excellent resource for all things calculation if you're interested in optimisation etc:

    http://www.decisionmodels.com/optspeed.htm
    http://www.decisionmodels.com/calcsecrets.htm

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1