+ Reply to Thread
Results 1 to 6 of 6

Regression failure - but why?

  1. #1
    Registered User
    Join Date
    01-08-2020
    Location
    Denmark
    MS-Off Ver
    Office 365
    Posts
    3

    Regression failure - but why?

    I'm trying to run a regression analysis on the data, but everytime I include December, it somehow fails with #NUM on the regression.

    If I exclude december tho, it success, but I need the last month in order to make the function.

    I have run a value() on all the numbers to make sure it's numeric.

    I have success with these combinations:

    Sales (Y) + C20 (X) + Temp (X) +Months -December (X) => Works fine!
    Sales (Y) + All months (X) => Works fine!
    Sales (Y) + All variables (X) => Doesn't work


    Can anyone tell me why?
    Attached Files Attached Files

  2. #2
    Forum Guru
    Join Date
    04-13-2005
    Location
    North America
    MS-Off Ver
    2002/XP, 2007, 2024
    Posts
    16,380

    Re: Regression failure - but why?

    Your sample file only includes your raw data with none of your attempts at the regression, so it is hard to know exactly what you did.

    I did not have any trouble with the regression using the LINEST() function. In R3:R24, I array entered the formula =TRANSPOSE(LINEST($B$3:$B$962,$C$3:C962,TRUE)) and got a normal regression output. Copied across until column AE (so each column adds another X predictor variable to the regression) and each column showed a normal regression result. The last column (column AE) returned 0 for one of the coefficients, suggesting that Excel had detected collinearity for one of the predictor variables.

    I cannot say why the regression is failing for you, because it seems to work fine for me using the LINEST() function.
    Quote Originally Posted by shg
    Mathematics is the native language of the natural world. Just trying to become literate.

  3. #3
    Registered User
    Join Date
    01-08-2020
    Location
    Denmark
    MS-Off Ver
    Office 365
    Posts
    3

    Re: Regression failure - but why?

    Sorry for missing examples.

    I have attached a new file with my regression examples. Forgot to mention I used data analysis toolpack.

    I see the transpose,linest function works to get coefficients but I need the p-value as well to confirm the analysis, which is why I use data analysis for the summary output.
    Attached Files Attached Files

  4. #4
    Forum Guru
    Join Date
    04-13-2005
    Location
    North America
    MS-Off Ver
    2002/XP, 2007, 2024
    Posts
    16,380

    Re: Regression failure - but why?

    From the OP, I was under the impression that the regression failed completely, but it looks to me like the regression ran just fine -- except for the p-value for February. I have no explanation for the error in that cell. As I noted above, the regression algorithms detect collinearity, so Excel assigns 0 values to January which naturally leads to errors in that row.

    I will note that, in my older version of Excel, when I run the regression it correctly calculates the p value for February. Is this a bug introduced between my older version and your newer version? If you were determined to track the error down, we would probably need to try the regression across multiple computers and versions to see how reproducible the error is and see if it correlates with version or something else.

    From previous research (https://www.excelforum.com/excel-gen...ml#post4350280 ), I know that these p-values are simply from the t distribution, so they could be calculated outside of the regression utility using the TDIST() function. The easiest solution might be to calculate the t-distribution p-values separate from the regression output, using the other data (or using the LINEST() function, as I describe in the other thread).

    In summary, the error for the "January" p-value is due to the collinearity Excel detected. I don't know the reason for the error in the "February" p-value, but I note that I do not recreate the error in my version of Excel. When you understand the statistics behind these calculations, they are easy to create from the other regression outputs using the TDIST() function, so it may be preferable to compute p outside of the regression -- whether you decide to use the Regression Data Analysis tool or the LINEST() worksheet function.

    How would you like to proceed?

  5. #5
    Registered User
    Join Date
    01-08-2020
    Location
    Denmark
    MS-Off Ver
    Office 365
    Posts
    3

    Re: Regression failure - but why?

    The reason I want to use the data toolpack is cause I can see if all the variables fit the model.

    From the P-values of each model, i would be able to accept or discard the variable. If I do it on a single variable at the time, I don't get the full impact of multi variables regression, which is my goal here.

    I have no idea why the dummy months I have created, somehow cant be mixed into the regression.

    If I just use the Y-data on all the months, it works fine, but ERROR occurs when I fit in C20 or temperature numbers.

    I know i'm not hitting a max on regression, since i've read it's 15 variables and I only have 14.


    All I need is adjusted R on the whole model, Significance F, all P-values, coefficients and Intercept.
    Using analysis toolpak would be the easy solution, but something is wrong and I dont know.

    Can Linest or Tdist get me those numbers I want from above?

  6. #6
    Forum Guru
    Join Date
    04-13-2005
    Location
    North America
    MS-Off Ver
    2002/XP, 2007, 2024
    Posts
    16,380

    Re: Regression failure - but why?

    I'm not sure I understand your reticence to use the LINEST() function, but, for simplicity, let's focus on the regression utility output.

    The only errors I see in your regression output are E20 and E21 -- the P-values for the January and February coefficients. Because Excel detected collinearity in the data, it forced January's coefficient and error to 0, and the resulting P-value cannot be calculated. This is why E20 has an error in it. I do not know why E21 has an error, but I know that the P-value is calculated from the t-distribution =TDIST(ABS(t-stat),residual_df,2) (if you are unfamiliar with this part of the "slope test", you might review this tutorial: https://stattrek.com/regression/slope-test.aspx ). I can enter =TDIST(ABS(D21),$B$13,2) into K21 (or other convenient cell) and compute the correct P-value for February's coefficient. I can copy this formula up/down into K17:K31 to get the P-value for each coefficient and verify that the results match the values in E17:E31, if desired.

    If I understand what you are needing to do, that one addition should allow you to compute the missing P-value and perform the analysis. There are other ways to do this, but that seems the easiest approach starting from the results in your sample file. Will that work for you?

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. [SOLVED] Logistic Regression/Regression to predict if Item will sell or not based on ratings
    By chandramouliarun in forum Excel Formulas & Functions
    Replies: 3
    Last Post: 11-04-2019, 10:01 AM
  2. automative Regression, Macro, update regression with new values
    By #läuftbeimir?! in forum Excel General
    Replies: 6
    Last Post: 12-17-2018, 04:49 PM
  3. Replies: 0
    Last Post: 10-14-2018, 08:38 PM
  4. IF Failure ?
    By lbdyck in forum Excel Programming / VBA / Macros
    Replies: 5
    Last Post: 08-07-2018, 09:01 PM
  5. Do...Loop failure!
    By foxtrotdelta in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 11-05-2008, 09:17 AM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1