Article ID: 828528  View products that this article applies to. On This PageSUMMARYThis article describes the LOGEST function in Microsoft
Office Excel 2003 and in later versions of Excel. It illustrates how the function is used, and it compares the
results of the function in Excel 2003 and in later versions of Excel with the results of the function in
earlier versions of Excel. LOGEST is evaluated by calling the related function, LINEST. Extensive changes to LINEST for Excel 2003 and for later versions of Excel are summarized, and their implications for LOGEST are noted. Microsoft Excel 2004 for Macintosh InformationThe statistical functions in Microsoft Excel 2004 for Macintosh were updated by using the same algorithms as Excel 2003 and later versions of Excel. Any information in this article that describes how a function works or that describes how a function was modified for Excel 2003 and for later versions of Excel also applies to Excel 2004 for Macintosh.MORE INFORMATIONThe LOGEST(known_y's, known_x's, constant, statistics)
function is used to perform regression where an exponential curve is fitted. A
least squares criterion is used, and LOGEST tries to find the best fit under
that criterion. Known_y's represent data on the dependent variable, and
known_x's represents data on one or more independent variables. The second
argument is optional. If it is omitted, it is assumed to be an array of the
same size as known_y's that contain the values (1, 2, 3,
...). Assuming that there are p predictor variables, LOGEST fits an equation of the following form:
The last argument to LOGEST is set to TRUE if you want additional statistics, including various sums of squares, rsquared, fstatistic, and standard errors of the regression coefficients. In this case, LOGEST must be entered as an array formula. The last argument is optional. If it is omitted, it is interpreted as FALSE. The dimensions of the array are five rows by a number of columns equal to the number of independent variables plus one if the third argument is set to TRUE (plus 0 if the third argument is not set to TRUE). If the third argument "constant" is set to TRUE, you want the regression model to include the constant b in the previous equation in its model. If it is set to FALSE, the multiplicative constant b is excluded from the regression model by forcing it equal to one. The third argument is optional. If it is omitted, it is interpreted as TRUE. In this article, assume that data is arranged in columns so that known_y's is a column of y data and known_x's is one or more columns of x data. The dimensions or lengths of each of these columns must be equal. All the following observations are equally true if the data is not arranged in columns, but it is easier to discuss the most frequently used case. Also, assume that the last argument to LOGEST is always TRUE and that you are always interested in detailed output. This supposition helps to reveal numeric problems in earlier versions of Excel. Some numeric problems are still present if you use FALSE for this argument. This article uses the following examples to show how LOGEST relates to LINEST and to point out problems with LINEST in earlier versions of Excel that translate into problems with LOGEST. While the code for LOGEST has not been rewritten for Excel 2003 and for later versions of Excel, extensive changes and improvements in the code for LINEST have been made. LOGEST effectively calls LINEST, executes LINEST, modifies LINEST output, and presents it to you. Therefore, you should know about problems in the execution of LINEST. For additional information about LINEST, click the following article number to view the article in the Microsoft Knowledge Base: 828533 The LINEST Help file has also been revised for Excel 2003 and for later versions of Excel, and is useful for more information.Because the focus of this article is on numeric
problems in earlier versions of Microsoft Excel, this article does not include many
practical examples of LOGEST. The LOGEST Help file contains two useful
examples.
(http://support.microsoft.com/kb/828533/
)
Description of the LINEST function in Excel 2003 and in Excel 2004 for Mac
Syntax
The most common use of LOGEST includes two ranges of cells that contain the data, such as LOGEST(A1:A100, B1:F100, TRUE, TRUE). Because there is typically more than one predictor variable, the second argument in this example contains multiple columns. In this example, there are 100 subjects, one dependent variable value, known_y's, for each subject, and five dependent variable values, known_x's, for each subject. Example of usageTwo separate Excel worksheet examples are provided to illustrate the following key concepts:
828533 To illustrate LOGEST with the third argument set to
FALSE, follow these steps:
(http://support.microsoft.com/kb/828533/
)
Description of the LINEST function in Excel 2003 and in Excel 2004 for Mac
Collapse this table
In Excel 2002 and in earlier versions of Excel, when LINEST is called with its third argument set to FALSE, it always computes a regression sum of squares that is not correct because it uses a formula that is not correct. This problem has been corrected in Excel 2003 and in later versions of Excel. Notice that the values in the first two rows of the output table are not affected by this problem. In Excel 2002 and in earlier versions of Excel, the LINEST article describes a workaround to generate the appropriate values in the last three rows of the LINEST output table. If you are using an earlier version of Excel and you want to set the third argument to LOGEST to FALSE, we recommend that you explicitly execute steps 2 and 3 of the previous procedure, and then use the workaround in the LINEST article to modify the last three rows of the LINEST output table. Problems occur because of collinear known_x's in LOGEST or LINEST in Excel 2002 and in earlier versions of Excel. Predictor columns, known_x's are collinear if at least one column, c, can be expressed as a sum of multiples of others (c1, c2, and other values). Column c is called redundant because the information that it contains can be constructed from the columns (c1, c2, and other values). The fundamental principle in the presence of collinearity is that results should not be affected by whether a redundant column is included in the original data or removed from the original data. Because the version of LINEST in Excel 2002 and in earlier versions of Excel did not look for collinearity, this principle was easily violated. Predictor columns are nearly collinear if at least one column, c, can be expressed as almost equal to a sum of multiples of others (c1, c2, and other values). In this case, "almost equal" means a very small sum of squared deviations of entries in c from corresponding entries in the weighted sum of c1, c2, and other values. For example, "very small" might be less than 10^(12). To illustrate LOGEST collinearity, follow these steps:
Collapse this table
The first model, in rows 8 to 13, uses columns B and C as predictors and requests Excel to model the constant where the third argument set to TRUE. Excel then effectively inserts an additional predictor column that looks just like cells D2:D6. Notice that entries in column C in rows 2 through 6 are exactly equal to the sum of corresponding entries in columns B and D. Therefore, collinearity occurs because column C is a sum of multiples of column B, and the Excel additional column of 1 is inserted because the third argument to LOGEST was omitted or TRUE that is the "normal" case. This collinearity causes numeric problems, and Excel 2002 and earlier versions of Excel cannot compute results and the LOGEST output table is filled with #NUM!. Any version of Excel can handle the second model in rows 15 through 20. Collinearity does not occur with this model, and the user requests Excel to model the constant. This example is included here for the following two reasons:
In the second model in rows 16 to 20, collinearity does not occur, and none of the columns are removed. The predicted y values are the same in both models because removing a redundant column that is a sum of multiples of others (the first and second models) does not reduce the goodness of fit of the resulting model. Such columns are removed precisely because they represent no value added in trying to find the best least squares fit. Also, in the output of Excel 2003 and of later versions of Excel in cells I8:K20, the last three rows of the output tables are the same, and the entries in cells I16:J17 and cells J9:K10 coincide. This demonstrates that the same results are obtained when column C is included in the model but found to be redundant (output in I9:K13) as when column C was eliminated before LOGEST was run (output in I16:J20). This output satisfies the fundamental principle in the presence of collinearity. Collinearity is identified in LINEST in Excel 2003 and in later versions of Excel by using a completely different approach, QR Decomposition, to solve for the regression coefficients. The LINEST article describes a walkthrough of the QR Decomposition algorithm for a small example. Summary of results in earlier versions of ExcelLOGEST results are adversely affected in Excel 2002 and in earlier versions of Excel by results in LINEST that are not accurate.LINEST used a formula that is not correct for the total sum of squares when the third argument in LINEST is set to FALSE. This formula resulted in values of the regression sum of squares that are not correct. Also, the values that depend on the regression sum of squares, r squared and the f statistic, are not correct. (See the workaround in the LINEST article if you are using an earlier version of Excel.) Therefore, users of LOGEST must use this workaround when they call LOGEST with the third argument set to FALSE. Regardless of the value of the third argument, LINEST was calculated by using an approach that did not address collinearity issues. Collinearity caused round off errors, standard errors of regression coefficients that were not appropriate, and degrees of freedom that were not appropriate. In some cases, round off errors were sufficiently severe that the LINEST output table was filled with #NUM!. LINEST generally provides acceptable results if the following conditions are true:
Summary of results in Excel 2003 and in later versions of ExcelThe following improvements in LINEST have been made:
ConclusionsThe performance of LOGEST has been improved because LINEST has been greatly improved in Excel 2003 and in later versions of Excel. If you use an earlier version of Excel, verify that the predictor columns are not collinear before you use LOGEST. Also, be careful to use the workaround that is presented in the LINEST article when the third argument to LOGEST is set to FALSE. Although this information in this article and in the LINEST article may seem alarming to users of Excel 2002 and of earlier versions of Excel, collinearity is a problem in a small percentage of cases. Calls to LOGEST with the third argument set to FALSE are probably also relatively rare in practice. Earlier versions of Excel give acceptable LOGEST results when there is no collinearity and when LOGEST's third argument is TRUE or omitted.Note The improvements in LINEST also affect the Analysis ToolPak's linear regression tool (which calls LINEST) and two other related Excel functions: TREND and GROWTH. PropertiesArticle ID: 828528  Last Review: January 13, 2007  Revision: 4.2
