Article ID: 827459 - View products that this article applies to.
This article describes the BINOMDIST function in Microsoft Office Excel 2003 and in later versions of Excel, illustrates how to use the function, and compares the results of the function for Excel 2003 and for later versions of Excel with its results for earlier versions of Excel.
Microsoft Excel 2004 for Mac informationThe statistical functions in Excel 2004 for Mac were updated by using the same algorithms that were used to update the statistical functions in Excel 2003 and in later versions of Excel. Any information in this article that describes how a function works or how a function was modified for Excel 2003 and for later versions of Excel also applies to Excel 2004 for Mac.
When cumulative = TRUE, the BINOMDIST(x, n, p, cumulative) function returns the probability of x or fewer successes in n independent Bernoulli trials. Each of the trials has an associated probability p of success (and probability 1-p of failure). When cumulative = FALSE, BINOMDIST returns the probability of exactly x successes.
Example of usageMake the following assumptions:
If the batter gets 50 hits in his first 200 trials (a .250 average), he must get 100 hits in his next 300 trials to have 150 hits and a .300 average over 500 trials. You can use the following table to analyze the chance that the batter gets sufficient hits to maintain his average. Baseball commentators frequently allude to the "law of averages" when they say that fans do not have to worry about the performance of this batter with only 50 hits in his first 200 trials because "by the end of the season his average will be .300." If the trials really were independent, and the batter really had a 0.3 chance of success on any one trial, this reasoning is fallacious because the outcomes of the first 200 trials do not affect the success or the failure over the last 300 trials.
To illustrate the use of BINOMDIST, create a blank Excel worksheet, copy the following table, select cell A1 in your blank Excel worksheet, and then paste the entries so that the following table fills cells A1:C22 in your worksheet.
Note After you paste this table into your new Excel worksheet, click the Paste Options button, and then click Match Destination Formatting. With the pasted range still selected, use one of the following procedures, as appropriate for the version of Excel that you are running:
Collapse this tableExpand this table
Cells B4:B14 show the probabilities of exactly x successes in 10 trials. The most likely number of successes is 3. The chances of 0, 6, 7, 8, 9, or 10 successes are each less than 0.05 and add to about 0.076. So the chances of 1, 2, 3, 4, or 5 successes is about 1 – 0.076 = 0.924. Cells C4:C14 show the probabilities of x or fewer successes in 10 trials. You can verify that the entries in column C in any row are each equal to the sum of all the entries in column B, down to and including that row.
B18:B20 show that the most likely number of successes in 300 trials is 90. The probability of exactly x successes increases as x increases to 90, and then decreases as x continues to increase higher than 90. The chance of 90 or fewer successes is just over 50%, as C20 shows. The chance of 99 or fewer successes is about 0.884. Therefore, there is only an 11.6% chance (0.116 = 1 – 0.884) of 100 or more successes.
Results in earlier versions of ExcelKnusel (see note 1) documented instances where BINOMDIST does not return a numeric answer and yields #NUM! instead because of a numeric overflow. When BINOMDIST returns numeric answers, they are correct. BINOMDIST returns #NUM! only when the number of trials is greater than or equal to 1030. There are no computational problems if n < 1030. In practice, such high values of n are unlikely. With such a high number of independent trials, a user may want to approximate the Binomial distribution by a normal distribution (if n*p and n*(1-p) are sufficiently high, for example, each is greater than 30) or by a Poisson distribution otherwise.
Note 1 Knusel, L. "On the Accuracy of Statistical Distributions in Microsoft Excel 97", Computational Statistics and Data Analysis (1998), 26: 375-377.
For the non-cumulative case, BINOMDIST(x, n, p, false) uses the following formula
COMBIN is an Excel function that gives the number of combinations of x items in a population of n items. COMBIN(n,x) is sometimes written nCx, and named a "combinatorial coefficient" or just, "n choose x". If you experiment with COMBIN by typing =COMBIN(1029,515) in one cell and =COMBIN(1030,515) in a different cell, the first cell returns an astronomical number, 1.4298E+308, and the second cell returns #NUM! because it is even larger. The overflow of COMBIN causes an overflow of BINOMDIST in earlier versions of Excel.
COMBIN has not been modified for Excel 2003 and for later versions of Excel.
Results in Excel 2003 and in later versions of ExcelBecause Microsoft has diagnosed when an overflow causes BINOMDIST to return #NUM! and knows that BINOMDIST is well-behaved when overflow does not occur, Microsoft has implemented a conditional algorithm in Excel 2003 and in later versions of Excel.
The algorithm uses BINOMDIST code from earlier versions of Excel (the computational formula mentioned earlier in this article) when n < 1030. When n >= 1030, Excel 2003 and later versions of Excel use the alternative algorithm that is described later in this article.
Typically, COMBIN overflows because it is astronomical, but p^x and (1-p)^(n-x) are each infinitesimal. If it were possible to multiply them together, the product would be a realistic probability between 0 and 1. However, because existing finite arithmetic cannot multiply them, an alternative algorithm avoids the evaluation of COMBIN.
Microsoft's approach calculates an unscaled sum of all the probabilities of exactly x successes that are used later for scaling purposes. It also calculates an unscaled value of the probability that you want BINOMDIST to return. Finally, it uses the scaling factor to return a correct BINOMDIST value.
The algorithm takes advantage of the fact that the ratio of successive terms of the form COMBIN(n,k)*(p^k)*((1-p)^(n-k)) has a simple form. The algorithm proceeds as described in the pseudocode in the following steps.
Step 0: (Initialization). Initialize the TotalUnscaledProbability and the UnscaledResult properties to 0. Initialize the constant EssentiallyZero to a very small number, for example, 10^(-12).
Step 1: Find n*p and round down to the nearest whole number, m. The most likely number of successes in n trials is either m or m+1. COMBIN(n,k)*(p^k)*((1-p)^(n-k)) decreases as k decreases from m to m-1 to m-2, and so on. Also, COMBIN(n,k)*(p^k)*((1-p)^(n-k)) decreases as k increases from m+1 to m+2 to m+3, and so on.
Step 2: Calculate the unscaled probabilities for k > m:
Step 3: Calculate the unscaled probabilities for k < m:
Step 4: Combine the unscaled results:
Although this method is used only for n >= 1030, you can use the following additions to the Excel worksheet to help you hand-execute this algorithm to calculate BINOMDIST(3, 10, 0.3, TRUE) (in the baseball example, the chance of 3 or fewer hits in 10 trials for a .300 batter).
To illustrate this, copy the following table, select cell D4 in the Excel worksheet that you created earlier, and then paste the entries so that the following table fills cells D1:E15 in your worksheet.
Column D contains the unscaled probabilities. The 1 in cell D6 is the result of Step 1 of the algorithm. Excel 2003 and later versions of Excel calculate the entries in cells D7, D8, ..., D14 (in that order) in Step 2. Excel calculates the entries in cells D5 and D4 (in that order) in Step 3. The sum of all unscaled probabilities appears in D15.
Collapse this tableExpand this table
To calculate the probability of 3 or fewer successes, type the following formula in any blank cell:
= SUM(D4:D7)/D15In the previous example, EssentiallyZero does not stop Steps 2 or 3. However, if you want to evaluate BINOMDIST(550, 2000, 0.3, TRUE), EssentiallyZero may stop Step 2 or Step 3. A binomial random variable with n = 2000 and p = 0.3 has a distribution that is approximated by the normal with mean 600 and standard deviation SQRT(2000*0.3*(1 – 0.3)) = SQRT(420) = 20.5. Then 805 is 10 standard deviations higher than the mean and 395 is 10 standard deviations lower than the mean. Depending on your setting of EssentiallyZero, EssentiallyZero may stop Step 2 before you reach 805 and may stop Step 3 before you reach 395.
ConclusionsInaccuracies in versions of Excel that are earlier than Excel 2003 occur only when the number of trials is greater than or equal to 1030. In such cases, BINOMDIST returns #NUM! in earlier versions of Excel because one term overflows in a sequence of terms that are multiplied together. To correct this behavior, Excel 2003 and later versions of Excel use the alternative procedure that is mentioned earlier in this article when such an overflow would otherwise occur.
The CRITBINOM, HYPGEOMDIST, NEGBINOMDIST, and POISSON function exhibit similar behavior in earlier versions of Excel. These functions also return either correct numeric results or #NUM! or #DIV/0!. Again, problems occur because of overflow (or underflow).
It is easy to determine when and how these problems occur. Excel 2003 and later versions of Excel use an alternative algorithm that is similar to the one for BINOMDIST to return correct answers in cases where earlier versions of Excel return #NUM!.
Article ID: 827459 - Last Review: September 19, 2011 - Revision: 5.0