# Microsoft Office Application Help - Excel Help forum > Excel Formulas & Functions >  > [SOLVED] HELP!! "Calculating 4 threads" nightmare problem!

## jackblack2

I have the most irritating problem - every time I enter a value into a cell I get a "calculating 4 threads" notification and the system starts counting up from 1%,2%,3% and so on until 100% and this process takes a hundred seconds and sometimes even more and while it calculates, Excel freezes and I cannot do anything in the file! This results in a task that should take me one hour to do, now taking me 2 hours to do! Pardon my French, but WTF is going on! Is Excel re-routing the calculations through China! The problem is so severe that it will even slow down my internet feed and slow down other excel files I have open.

I had this problem in the past but it has gotten severe now. I was told that it has to do with formulas which are looping. The size of my Excel file is 6.3 MB and the file contains 190,000 formulas which were replicated in an incremental way (via the offset function in the formula) using the drag function. The more I replicated the formulas, the longer the calculations take.

I have tried copying the entire file into a new Excel file but a portion of the formulas are pasted as values and not formulas when I try and paste the formulas into a new file. A pop up informs me that it is not possible to paste the formulas because it has to do either with the fact that the formulas were pasted from a "protected view" file or from a different version of excel.

Please instruct me how to fix this most irritating problem. I have no knowledge about formulating formulas. The formulas were formulated for me. So please try and explain as simply as possible how I need to fix the problem.

----------


## FDibbins

OFFSET is a volatile function, and will cause a recalc for ANY change in the workbook.  Given that you have 190 000 formulas in the file, Im not surprised is is slow.

Can you upload a *small* (clean) sample workbook (not a pic) of what you are working with, and what your expected outcome would look like.

----------


## Roel Jongman

The "quick fix" without changing any of the formulas is by changing the recalculation of the sheet from automatic to manual. Then the sheet will not calculate on every change but only when you save the file or when you press F9

----------


## jackblack2

How do I change from automatic to manual? And does manual result in more work on my part?

----------


## Roel Jongman

Calculation mode can be set per worksheet thru options or Formulas menu. At the end of formulas menu there is the calculation section where you can change is. 
Under options there is also a formulas subsection where you can set is. 

More work is not so much the issue it takes a press of F9 to recalculate.

----------


## FDibbins

I would still suggest uploading a small sample, so we can review what you have  :Smilie:

----------


## jackblack2

@Roel Jongman

Thanks a million! That fix solved 90% of the problem. But is there a way to speed up the calculations, since I still can't escape the ridiculous amount of time I have to wait (can be over 100 seconds!) for the system to calculate the calculations after I click "calculate now". 

I cannot upload a small sample, since this is confidential work.

----------


## FDibbins

OK, so if no file, can you show some samples of your formulas?

----------


## davsth

you can change the information, a sample doesn't need to be sensitive data. but failing that a formula that is being used can be changed (perhaps) but without seeing a formula no one has an opportunity to help. As the file seems large, do some of the formula actually not change after the first time they are calculated, so could be changed to values?

----------


## MrShorty

Without a sample file or examples of your formulas, it will be difficult for us to be specific.

If it helps, here are common bottlenecks that I see on this forum:

1) Perhaps because almost all examples on the internet and in Excel's help files are for exact match linear (4th argument of VLOOKUP() is FALSE or 3rd argument of MATCH() is 0) searches, people tend to use these options in their lookups. Linear lookups are exceptionally slow (and especially noticeable if you are doing 190000 lookups over large ranges). Converting these to binary lookups is much faster.
2) Because we tend to want to cram as much calculation into single cell formulas, many times we create a lot of extra effort. This often creates duplicated effort, where thousands of copies of a formula are performing the exact same lookup/other calculation thousands of times. I have found many examples where finding this duplicated effort and moving it to helper cells/ranges vastly improves a spreadsheets performance. This example https://www.excelforum.com/excel-for...ows-excel.html takes the exact same multiple lookups per row and reduces it to one lookup per row using a helper cell. Here's another one (without specifics) that uses a helper column to go from "essentially unusable" to "nearly instantaneous" https://www.excelforum.com/excel-pro...n-a-macro.html
3) Pivot tables are much faster than worksheet formulas, and a lot of times these questions are about a spreadsheet that is doing pivot table work. If your desired outcome fits into what a pivot table can do, then I would restructure the spreadsheet to use a pivot table (in this example https://www.excelforum.com/excel-gen...la-faster.html I added a simple helper column to a table and then summarized using a pivot table which dramatically improved performance).

Those are the kinds of things I would be looking at.

Interestingly, I found this thread in my saved threads https://www.excelforum.com/excel-gen...so-slowly.html where it looks like we have kind of been over some  of this before with you. As before, if you cannot provide examples or make use of the resources we point you towards, I'm not sure how much help we can provide.

----------


## jackblack2

Sorry for the delay in replying.

Unfortunately I am not a computer expert and I don't have much knowledge of how to work with Excel other than basic functions. So I don't understand how to make use of the above advice of MrShorty. And unfortunately in order to protect my work I cannot provide an example of the formula but I will say that the formula is doing the same calculation (in an incremental fashion using the OFFSET function) over 150,000 times which is likely the reason the calculations take so long. Yet how is it possible that it would take up to 100 seconds?? Aren't today's modern laptops with 4 processors capable of doing millions of calculations per second. So is the file doing billions of calculations during that 100 second period?

Since the switch to manual calculations has increased the speed of the file back up to normal (in addition to the file freezing for 100 seconds, it also used to freeze for a couple of seconds every now and again and there was a split second delay every click of the mouse and it even slowed my internet feed when the file was open and switched to automatic) I would like to replicate the formula close to 730,000 times. But first I need to know what is the limit of columns in an up to date excel spreadsheet. I was told that there are up to 1 million rows in an excel spreadsheet yet what is the limit regarding amount of columns in an excel spreadsheet.

Thanks.

----------


## MrShorty

Excel specifications and limits https://support.office.com/en-us/art...7-269d656771c3 shows that a single tab has 1E6+ rows and ~16k columns.




> So is the file doing billions of calculations during that 100 second period?



 I would not be at all suprised if there are billions or even trillions of operations that your spreadsheet is triggering. Consider a hypothetical spreadsheet performing 100000 linear lookups on a 200000 row data set. I am not enough of a computer scientist to know exactly what constitutes an "operation" at the processor level, but it is easy to see that 100000 linear lookups across a 200000 row data set requires at least 100000*100000=1E10=10 billion operations just to perform the lookup. And that is not considering the operations required to set up the lookup (you say your formula is using OFFSET() functions to set up the ranges needed) and the operations/calculations being performed before and after the lookup and the "overhead" needed by Excel to retrieve data from memory, keep track of its calculation dependency tree and write results to the screen. Yes, I can easily conceive of a simple large spreadsheet that needs billions and trillions of operations to calculate. If you are needing to expand this to 750000 copies, then it is even easier to move into the trillion or quadrillions of operations.




> Unfortunately I am not a computer expert and I don't have much knowledge of how to work with Excel other than basic functions. So I don't understand how to make use of the above advice of MrShorty. And unfortunately in order to protect my work I cannot provide an example of the formula



 If you consider all of this (both data and spreadsheet formulas/design) to be intellectual property important enough to not share on the forum, and you cannot "mock up" fake data and similar formulas that do not violate your intellectual property rules, I am not sure how much help we can be. I would suggest that, because this same issue has come up before, it seems likely that it will come up again. It seems that it should be important you and those you work for that you become enough of a computer expert to learn how to build efficient spreadsheets. Somehow, it seems prudent that you find the time and training resources to develop some of this expertise.

FWIW, I don't think anything that has been suggested really involves anything beyond the basic Excel functions you are already using. Most of the time, it is about using those same basic functions differently. If IP rules really prevent you from sharing your formulas, I don't know how much we can help. Maybe you can at least suggest some of what your formulas are doing: Are there lookups in your formulas? Are there database summary functions (SUMIFS(), COUNTIFS(), SUMPRODUCT() or anything similar) in your formulas? Can you give us any other ideas of what is in your formulas? Can you tell us if you are familiar with Excel's database/pivot table tools, so we know if we can suggest that direction as a possibility?

----------


## jackblack2

> a single tab has 1E6+ rows and ~16k columns.







> I would not be at all suprised if there are billions or even trillions of operations that your spreadsheet is triggering.







> If you are needing to expand this to 750000 copies, then it is even easier to move into the trillion or quadrillions of operations.



These statistics are absolutely mindboggling! That first statistic means that if there are 50 spreadsheets in an excel workbook, then an excel workbook contains a surface area of over 50 square kilometers of cells. That's trillions of cells! I would then assume that an excel file can contain petabytes of data. One reason I can think of for why Microsoft would offer so many cells is for projects like CERN where petabytes of information need to be stored.

And I thought I would run out of columns (need 6000 of them).

The second statistic would explain why the calculations take so long to complete.

I raised the amount of formulas to 728,000 and when I clicked "calculate now" it took the system 3 full minutes to calculate and the file kept crashing every few seconds for 1-2 seconds. So here is a question I have (one might need to be a computer scientist to answer this question) - is there any risk of overheating the micro-processors of my laptop if they are processing quadrillions of calculations per second? I have read that the faster the flow of electrons through the circuitry, the more heat is produced. The last thing I want is to burn out my laptop.

As for what I can reveal about the formula - it is a COUNTIFS() formula and as mentioned it contains the OFFSET function twice in the formula.
I am not familiar withExcel's database/pivot table tools. Could they speed up the calculations?

----------


## Jacc

No risk of overheating, computers have heat sensors and are smart enough to speed up the fan or even shut down if the fan can't cope.
Mom's laptop used to shut itself down everytime you tried to play a youtube video. I took it apart and found a thick, solid blanket of cat hairs blocking the air to the cooling fins. Cleaned it and then it ran youtube all day.

----------


## Jacc

I'm amazed it can run 728 000 rows of formulas in 3 minutes, I wouldn't even consider that many lines of formula.

Will a binary file format help maybe? I have no experience myself, I rely on the seniors to comment on this. https://www.quora.com/What-is-the-pu...orkbook-format

----------


## jackblack2

@Jacc 

Glad to hear there is no risk of overheating




> I'm amazed it can run 728 000 rows of formulas in 3 minutes, I wouldn't even consider that many lines of formula.



Unfortunately there is some kind of looping issue since the formulas were originally pasted from an earlier version of Excel

Unfortunately I only know how to work an excel file at the basic level. So I don't know if a binary file format will help.

By the way, I suspect your mom's cat was watching too many cat videos on YouTube lol. Sorry, I had to.

----------


## Jacc

Ha ha... he probably was   :Smilie:  .

I kindly but annoyingly repeat the request of my Excelforum peers, create a completely new workbook, add some dummy data (just random letters or numbers) and type in the formula on those random data. 
If you make such a workbook and post it we will be able to give you much better help.
Also remove your name: https://superuser.com/questions/5045...l-2010-documen

----------


## jackblack2

Sorry, but I just can't take the risk. I cannot specify why. 

The best I can reveal is that it is a COUNTIFS() formula and it contains the OFFSET function twice. I dont believe the issue is actually with the formula but with the way it was pasted into my file, since when I try and paste the formula into a new workbook, I recieve a pop up that informs me that it is not possible to paste the formula so it will be pasted as a value. The pop up informs me this could be because the formula was pasted from a "protected view" file (was not), or from another application or from another version of excel (this might be the reason)

----------


## MrShorty

Counting "records" based on the values in different "fields" (introducing some database terminology) is one of the tasks pivot tables are supposed to do well on a well designed database table. If this is the kind of work you are doing I would guess that designing a good database and using Excel's database tools (like pivot tables) will dramatically improve the performance.

First common hurdle when using pivot tables is getting the source data into a nice database format. Here's a recent discussion about the best data layout for input into a pivot table: https://www.excelforum.com/excel-new...ta-layout.html Short summary of this discussion -- make a single flat list for the source data. 

Second is learning how to use pivot tables. Pivot tables are easy, and should require no more than a basic understanding of Excel. Here's one of many pivot table tutorials from around the internet: https://www.excel-easy.com/data-****...ot-tables.html

None of this should require advanced Excel or computer skills. It should only require that you learn to recognize a "database" task and learn a little about how to design a good database and how to use Excel's pivot table tools. As I noted with your previous thread, this does not seem like the first time (nor the last, I suspect) that you will deal with this kind of issue. So it seems like it should be of value to you to learn these simple skills so you can use them now and in the future.

----------


## jackblack2

@MrShorty

Thanks. I will look into it later. For the mean time I can deal with this issue if I switch off automatic calculations.

----------


## Jacc

One exciting countifs formula.
Did you paste the 200 000 rows of formula in one go? That sounds like a bad move. I would paste the formula in one cell, check that this was operating correctly, fill down another 10 lines or so, check again and then fill down the rest.

----------


## jackblack2

Actually the formula does not fill many rows but many columns. I originally pasted in only just about a hundred of them and then I replicated the formula hundreds of thousands of times by dragging it across the spreadsheet. I believe the problem is that it was pasted from a different version of excel.

I have another question - Since I now have hundreds of thousands of cells filled in with data, I need to use the find function on excel. So how do I instruct excel to search for a certain value only in a specific zone in the spreadsheet and not in the entire spreadsheet? I clicked "options" but there is no option to specify an zone you want to search in.

----------


## MrShorty

Which "find" or lookup function are you trying to use -- VLOOKUP(), HLOOKUP(), MATCH(), LOOKUP(), other? (For a look at Excel's lookup and reference functions, see the lookup and reference section of this help file https://support.office.com/en-us/art...1-63f26a86c0eb ). What you describe almost sounds like a 2D kind of lookup (find the column that will contain the datum, then find the datum within that column). In those cases, I tend to think first of using two MATCH() functions (one for column number and one for row number), then use an INDEX() function to retrieve the datum.

As this thread started with questions about calculation efficiency, I will remind you that lookups (especially linear "exact match" lookups) are really slow. It sounds like you will end up with a lot of data to search through. A lot depends on how much effort you want to go to in designing this, but it seems like there could be real value in really thinking through how you are doing this whole "summarize the data then find stuff in the result" operation to make sure it is efficient.

----------


## Jacc

If only one cell is selected, Excel will search the entire sheet.
If two or more cells are selected, Excel will search the selected cells only.

----------


## Pat and only Pat

uses
COUNTIFS
and
OFFSET

----------


## AliGW

Pat - if you have a solution to this particular issue, please outline it here. Don't try to send other members on a wild goose chase following links around the forums. Thanks.

----------


## Pat and only Pat

phrasing by OP suggests his formulas are proprietary -- i was merely pointing out that his formulas come from a response (elsewhere) on this forum -- so if anyone knows how to replace the *OFFSET* with something more efficient, please see those formulas in that discussion

----------


## Pat and only Pat

and no it is not "another issue" -- it is the same formulas (in another discussion launched by the same OP)

----------


## jackblack2

@Jacc





> If two or more cells are selected, Excel will search the selected cells only.



How do you select two or more cells? I have tried highlighting a group of cells by dragging the mouse (if you know what I mean) yet when I use the "find" function it still bring up a list of the value I am looking for but in the entire spreadsheet and not just the selected cells.

----------


## jackblack2

@AliGW

There are some replies on this thread for which a notification is not sent to my inbox despite checking the box that says notify instantly. Is there a glitch in the system?

----------


## AliGW

There has been an issue for some. Please comment in this thread (where the admins and tech guys will see it): https://www.excelforum.com/suggestio...-not-noti.html

----------


## Jacc

> @Jacc
> How do you select two or more cells? I have tried highlighting a group of cells by dragging the mouse (if you know what I mean) yet when I use the "find" function it still bring up a list of the value I am looking for but in the entire spreadsheet and not just the selected cells.



Select cells, hit Ctrl + F.

----------


## jason.b75

> I had this problem in the past but it has gotten severe now. I was told that it has to do with formulas which are looping. The size of my Excel file is 6.3 MB and the file contains 190,000 formulas which were *replicated in an incremental way (via the offset function in the formula)* using the drag function. The more I replicated the formulas, the longer the calculations take.







> As for what I can reveal about the formula - it is a* COUNTIFS()* formula and as mentioned it contains the OFFSET function twice in the formula.
> I am not familiar withExcel's database/pivot table tools. Could they speed up the calculations?



Didn't have time to read the whole thread which looks to contain a lot of good information and advice, so this is just a quick observation.

Using incrementing ranges with countifs, or similar functions appears to be a bad idea when you exceed around 20k formulas (some appear worse than others, possibly due to criteria hierarchy). When posting suggestions that require this I try to make use of other things to make it more efficient.

As your formula is dragable, it should be possible to define the offsets as named ranges, then use the named ranges in the counifs formula instead, which may improve things slightly.

Posting your formula, doctored if needed will give us some clues as to what can be done.

If you can't post your sheet names for any reason, then just rename them sheet1, sheet2, etc. We don't need to see any confidential data, or any data for that matter, just the formula.

----------


## jackblack2

Before I possibly screw myself over by posting the formula, I have a question - without the data which the formula refers to, is it possible for someone to figure out exactly what the formula does?

If not I will post it since the calculation time has become ridiculous - over 5 minutes to open, close or update data in the file.

----------


## jason.b75

Only you can answer that based on the likelyhood of other people performing the same task as you, given that we don't know what that is, the odds of a correct guess are potentially reduced significantly.

If for example, you post a formula that uses a structure specific to finance, then someone with a finance background might recognise the principles behind it.

It is more likely than 10, or even 100 people would be able to work out what the formula 'could' do, but each would have a different use for it based on their own ideas and experience.

----------


## jackblack2

Oh well, since the calculation time has become rediculious, here goes - =COUNTIFS($A$1:$A$19000, 9, OFFSET($A$1,(COLUMN()-COLUMN($A396))/3,0):OFFSET($A$1,18999+(COLUMN()-COLUMN($A396))/3,0), E795) This formula has been replicated around 1 million times in my file

----------


## jason.b75

The single biggest problem is the volume of countifs formulas in the sheet.

This might make a slight difference by reducing the number of calculations to set up the offset, but I expect that the effort that this saves will be dwarfed by the drain of 1 million countifs .

=COUNTIFS($A$1:$A$19000, 9, OFFSET($A$1:$A$19000,(COLUMN()-1)/3,0), E795)

----------


## jackblack2

If the new formula hardly helps, then I prefer not to tinkle with the formula, since if I corrupt it, I am in for a world of trouble. If all I have to deal with is two calculation sessions while I work on the file, then I can live with that.

----------


## MrShorty

A few things I am seeing in this formula:

Like jason.b75, I would have combined those two OFFSET()s into one, though I would have used the optional height and width arguments -- OFFSET($A$1,COLUMN()-1)/3,0,19000,1) OFFSET() help file: https://support.office.com/en-us/art...e-b4d906d11b66

Where you are dividing the column# by 3, I notice that OFFSET() truncates this value if it is ever a fraction. You don't say what columns you are copying this into. This means that these OFFSET() functions will only return a new range for every 3rd column they are pasted into. Perhaps you already know that and are pasting it only into appropriate columns. If you are copying into every column, know that three adjacent columns are exact copies of each other. Reduce computation effort by deleting two out of every three columns.

What I see this formula doing is counting how many times the number in E795 is a specified number of rows beneath the number 9 in column A. Does that sound correct? You did not want to bother with jason.b75's minor edit to the existing formula, so I don't know that this next observation is going to help any (for this project -- hopefully for future projects it will provide something to think about next time). As I noted in post #10 point number 2, a lot of slow spreadsheets are repeating the same exact task multiple times. In this case, for example, every single copy of that formula must check if the value in A1 is 9. A million copies of that formula means that Excel must check A1 a million times to see if A1 is or is not 9. If it was not 9 the first time, it is still not 9 the millionth time. Of course, this is true for every other entry in A, and then for every value represented by E795 in the formula. I think really speeding this up is going to require more than minor edits (which you are reluctant to try anyway), but a complete rewrite. An algorithm that will find all of the 9s in A once, then find the other values, then figure out how many rows separate them or something similar. I will emphasize again, I don't think this involves advanced Excel skills or computer skills. It is a matter of applying the computer/Excel skills you already know to develop a different approach to the problem. Let us know if you are interested in exploring this at all. Your last post suggests that you are content with what you have, if something faster is going to require major rewrites.

----------


## Pat and only Pat

where is the main cost?
is it the COUNTIFS -- or the *OFFSET*?

----------


## jackblack2

> This means that these OFFSET() functions will only return a new range for every 3rd column they are pasted into.



That is correct. 

As for rewriting the formula or making a minor edit, I prefer not to touch for fear of corrupting it. I can live with a 10 minute delay for the two calculations I have to do each time I update the file. 

I am going to mark this thread as solved and add to your reputations, since the advice of turning off automatic calculations fixed the problem by 90%.

Thanks everyone.

----------


## jason.b75

It is extremely easy to dupliacte files, i.e. to create a back up in case it does get corrupted.

----------


## jackblack2

Presuming that I do replace the 1 million formulas with the new more efficient formula, how much would you estimate the calculation time would be cut down to - 10%? 50% 90%? I am not going to spend a considerable amount of time dragging the formulas over a million cells if all I will get is a 10% reduction in calculation time.

----------


## jason.b75

We need to come up with a theory that will work to get any kind of idea on the benifit.

Looking back, I see refernce to 6000 columns, so I assume 2000 of those contain the formulas, which would mean 500 rows of the formula? Am I close?

Next assumption is that the 9's counted in the first countifs criteria are equally spaced, say 500 rows apart?

If this is consistent, then you should not need to even check for the presence of 9's in the range. If you know that 9 appears in A1,A501,A1001 etc, then you only need to check A2,A502,A1002 etc against E795, likewise A3,A503,A1003 against H795 when you copy the formula to the next column.

Likewise, if I'm following the concept of your formula correctly, when you drag the formula down you will always be looking at the same cells in column A, just changing the criteria to E796,E797 etc.

If your data patterns vary, then things get more complicated, but if they are consistent then you could potentially make a significant improvement.

----------


## jackblack2

> Looking back, I see refernce to 6000 columns, so I assume 2000 of those contain the formulas, which would mean 500 rows of the formula? Am I close?



more or less correct.





> Next assumption is that the 9's counted in the first countifs criteria are equally spaced, say 500 rows apart?



Incorrect. Not equally spaced and the value is not always 9 





> Likewise, if I'm following the concept of your formula correctly, when you drag the formula down you will always be looking at the same cells in column A, just changing the criteria to E796,E797 etc.



correct.





> If your data patterns vary, then things get more complicated, but if they are consistent then you could potentially make a significant improvement.



They vary greatly.

I have a question, why is dragging across the spreadsheet so much slower than when dragging down the spreadsheet?. Its really annoying.

----------


## jason.b75

I don't think that dragging across is any slower than dragging down when you look at what is being done like for like.

If you drag down 500 rows in 1 column then you have 500 formulas to calculate, then when you drag the column across, that increases to 1000, 1500, 2000 etc, the increases are much greater.

If you drag a single cell across first, then drag them all down together then that will be the slower part.

Going back to the original problem, I'll give it some thought based on your responses above, but might be good to see if any of the others come back with some suggestion before going too far. Sometimes 3 or 4 people can come up with some ok or good ideas that can be put together to make a great one.

----------


## jackblack2

> I don't think that dragging across is any slower than dragging down when you look at what is being done like for like.



I have experimented with dragging empty cells across the sheet and down the sheet and still get a speed of 20 cells or columns per second as apposed to over a hundred cells or rows per second when dragging down.

----------


## jason.b75

With a maximised window?

Reduce the window size and move it left, then drag across and move the cursor outside of the visual boundary of excel.

It is not the fill rate, but the rate at which the sheet scrolls, the further you move the cursor beyond the boundary of the last visible cell, the faster it goes.
Scrolling down you have the excel scroll bar, the excel status bar and the windows taskbar outside of that boundary to cross with the cursor, scrolling across, you only have the width of the scroll bar.

I think column width might make a difference as well, narrower columns moving faster than wide ones, but haven't tested the theory.

----------


## jackblack2

Bingo! reducing the window size speeds up the dragging. Thanks!

----------


## jason.b75

Just looking over things again.





> Next assumption is that the 9's counted in the first countifs criteria are equally spaced, say 500 rows apart?
> Incorrect.
> 			
> 		
> 
> 
>  Not equally spaced and the value is not always 9



Is the sample formula specific to one, or a small number of columns, with the hardcoded criteria of '9' changing at some point?

Going back to the earlier posts when you were describing the formula, I was under the impression that it was the same formula copied across the whole sheet, with the relative cell references being the only variable.

----------


## jackblack2

I cannot get more specific than I already have, since I have to protect my work.

----------


## jason.b75

Fair enough. We've tried to help you out, but your reluctance to provide even the slightest of information makes it near impossible.
Maybe others will be willing to keep trying, but I'm done.

----------


## jackblack2

Thanks anyway. I'll go ahead and mark this thread as solved, since I now know to turn off automatic calculations, how to find a value only in a selected area, and how to speed up when dragging across the spreadsheet.

----------


## MrShorty

I'm a little late to today's discussion, but somebody asked if I would share my ideas for approaching this. I will throw this idea out:

First, you are wise to consider programming effort against how much performance improvement there will be. My mentor commented in one of his programs (old Fortran) that he recognized that what he did was slow for the computer, but he felt that the approach he used was easier to program, debug, and edit. He did not want to invest a lot of effort into making the code more efficient if it was going to mean too much more work for him. I think you are right in the middle of that kind of conundrum here. If you are content to operate in manual calculatioin mode and manually force calculation when needed, that might be the best way to proceed for this particular project.

Somebody asked what I thought I would do. I haven't put a great deal of thought into this, but, as I wrote earlier, I think there is a lot of duplicated effort (is A1 still 9?) in the multiple countifs. I would try an approach that will try to eliminate (or at least minimize) repeatedly checking the same cells for a given value. As a brief overview/outline:

0) To simulate the input data for the formula, I envision rolling a 20 sided die ~40000 times. The formula is then answering questions like "how often did I roll a 10 immediately after rolling a 9?" or "How many times did I roll a 5 ten rolls after rolling a 15?" Of course, if it is a fair die, then I expect no patterns to emerge, but it provides a reasonable scenario for generating some test data and testing the algorithm I might develop.
1) Make a horizontal copy of the raw data (since Excel only allows ~16000 columns, this will require a few "blocks" of cells).
2) In the body of this "table", I create a formula that will create a "2D" block of cells that will uniquely identify each combination of "left" number and "top" number. In my simulation, each number is no more than two digits, so maybe something like =100*left+top. This will mean that, for the first question in (0), I would be counting how many times the value 910 occurs in the column/row that represents 1 row offset. The second question would be to count how many times the number 1505 occurs in the row/column that represents 10 rows offset. If the input data will never (or rarely) change, then I can copy and paste as values these combination numbers to reduce any need to recalculate them (like when opening/closing the workbook). I could even see some value in separating spreadsheets to have one to create the 2D block(s) of cells from the spreadsheet that analyzes the results.
3a) From there, COUNTIFS() could be used to count the desired combinations.
3b) Or, if the goal is count the result for every possible combination of number pairs and offsets, then I would consider a FREQUENCY() function (https://support.office.com/en-us/art...7-fd9ea898fdb9 ) or something similar that can calculate all of the counts in one go.

I am optimistic (perhaps too optimistic) that there are significant efficiency improvements to be made using something like this. However, it is fair to note that, if this really is a "count every possible combination of x,y,z", these kind of problems -- even when programmed at best efficiency -- become very large very quickly.

It's a quick overview, and will likely require a bit of time/effort to develop and debug it. So you need to decide how much time and effort you want to invest now so you will have less time in the future waiting for the spreadsheet to calculate. At present, you seem mostly content with the time spent waiting -- you are not yet bothered enough to invest this kind of time and effort.

----------


## jackblack2

Before I would consider reformulating, I would need to know a rough estimate of how much time I would save per calculation. It currently takes 300 seconds (5 minutes) when I click calculate now. So would I save a few seconds or a few minutes if I use the above more effecient formula? 

In the future I could increase the number of formulas to 3 million. So I might have no option but to make the formula more efficient.

----------


## Pat and only Pat

> I cannot get more specific than I already have, since I have to protect my work.



you could've pointed to the source of these formulas --



> You can use *OFFSET* combined with a COLUMN calculation to get the rows to increment as you go across columns.
> 
> Put this in F2:
> 
> 
> ```
> Please Login or Register  to view this content.
> ```
> 
> ...

----------


## jackblack2

I realize that. But I prefered the source not be revealed. Oh well.

----------


## MrShorty

> Before I would consider reformulating, I would need to know a rough estimate of how much time I would save per calculation. It currently takes 300 seconds (5 minutes) when I click calculate now. So would I save a few seconds or a few minutes if I use the above more effecient formula?



I quickly built a spreadsheet using my suggestion that this could be simulated by rolling a 20 sided die multiple times and looking at that sequence of numbers. In this spreadsheet, I assumed 1000 die rolls (counting over blocks of 500 rows). 

To approximate your scenario, I built a quick COUNTIFS(...,OFFSET(...)...) formula similar to yours and made 60000 copies of this formula. Each calculate event took about 16 seconds. Also, as noted in post #40, because OFFSET() is volatile, every edit to the spreadsheet -- even those that had not impact on the counting values -- was followed by a 16 second calculate event.

To compare I built a calculation sequence like I describe above. For the same 60000 counts, this sequence took about 1 second to calculate. Because it did not include any volatile functions, edits to the spreadsheet that did not impact the source data were followed by instantaneous calculate events. Assuming the calculation times translate over to your setup, that means taking your 300 second calculation time down to something like 15 or 20 seconds.

As another test, I built a database based on the die rolls and used a pivot table to count the different possibilities. This pivot table took ~3 seconds to refresh.

In short, I think there are some significant improvements that can be made, if you want to invest the time and effort.

----------


## jackblack2

That is certainly a major improvement. So if I paste the new more efficient formula above into the first column and drag it across, will it increment and replace all the other formulas and show exactly the same values?





> every edit to the spreadsheet -- even those that had not impact on the counting values -- was followed by a 16 second calculate event.



This is the reason why I had to switch to manual calculations. Even opening another excel workbook while the one with the formulas in it was also open, caused the workbook to automatically begin calculating. Very annoying.

----------


## MrShorty

What I am proposing in post #54 is more than just a single, new, more efficient formula that simply plugs into your existing spreadsheet. What I am proposing is more of a complete rewrite of this part of your spreadsheet that breaks the counting task down into multiple, simple steps (in helper ranges). Go back to my post #54 and carefully read it, and see if you feel up to this kind of rewrite.

----------


## jackblack2

Here's the problem - I have no background at all in Excel. I am afraid of making a mistake that will cost me dearly. I prefer to do it the Kurdish way (as we say over here) and simply drag the new formula over the old one and make sure that the values are identical to the values with the former formula.

----------


## MrShorty

In that case, what I am proposing is not going to work. I didn't think anything I proposed required more background than you obviously have already, but it will require time to build and test and debug. You must decide how much you are willing to invest to get the file to calculate faster.

----------


## jackblack2

If the calculation time even in manual mode gets rediculious, I will take up your offer later. For now thanks for all the effort.

----------


## bamboomy

Also (I didn't read the whole thread so it's possible I repeat someone's answer but):

Take care also for field accesses on disk:

eg:

I had also the 4 Threads issue,
and in my case an open file accessed another file which wasn't open (ie, it continued to do disk accesses in stead of memory accesses)
(a LOT of times)
if your file accesses a file which isn't open, it takes a whole while longer to access that field
compared to when the same file is opened in excel (in memory).

In computer time: if you read something from disk it takes approximately 1000x more time than accessing something from memory...

So, if file A access file B, make sure its open in excel otherwise you wait a whole while longer to have the same result.
(I opened file B and after that you couldn't even blink during the same operation...)

m2c,

S.

----------

