Negative binomial distribution
From Wikipedia, the free encyclopedia
[Figure: probability mass function. The red line represents the mean, and the green line has an approximate length of 2σ.]
[Figure: cumulative distribution function.]
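Parameters | r > 0 (real); 0 < p < 1 (real) |
---|---|
Support | k ∈ {0, 1, 2, ...} |
Probability mass function (pmf) | $\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\,p^r(1-p)^k$ |
Cumulative distribution function (cdf) | $I_p(r, k+1)$, where $I_p(x, y)$ is the regularized incomplete beta function |
Mean | $\frac{r(1-p)}{p}$ |
Median | |
Mode | $\lfloor (r-1)(1-p)/p \rfloor$ if r > 1; 0 if r ≤ 1 |
Variance | $\frac{r(1-p)}{p^2}$ |
Skewness | $\frac{2-p}{\sqrt{r(1-p)}}$ |
Excess kurtosis | $\frac{6}{r} + \frac{p^2}{r(1-p)}$ |
Entropy | |
Moment-generating function (mgf) | $\left(\frac{p}{1-(1-p)e^{t}}\right)^{r}$ for $t < -\ln(1-p)$ |
Characteristic function | $\left(\frac{p}{1-(1-p)e^{it}}\right)^{r}$ |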
In probability and statistics, the negative binomial distribution is a discrete probability distribution. It describes the outcome of an experiment consisting of a sequence of independent trials that satisfies three conditions: each trial results in either success or failure; the probability of success, p, is the same for every trial; and the experiment continues until a fixed number of successes, r, has been achieved. The distribution then gives the number of failures observed along the way.
The Pascal distribution and the Polya distribution are special cases of the negative binomial distribution. Engineers, climatologists, and others conventionally reserve "negative binomial" in a strict sense, or "Pascal" (after Blaise Pascal), for the case of an integer-valued parameter r, and use "Polya" (after George Pólya) for the real-valued case. The Polya distribution models occurrences of "contagious" discrete events, such as tornado outbreaks, more accurately than the Poisson distribution does.
Specification of the negative binomial distribution
Probability mass function
The family of negative binomial distributions is a two-parameter family; several parameterizations are in common use. One very common parameterization employs two real-valued parameters p and r with 0 < p < 1 and r > 0. Under this parameterization, the probability mass function of a random variable with a NegBin(r, p) distribution takes the following form:
$$f(k; r, p) = \binom{k+r-1}{k}\, p^r (1-p)^k$$
for k = 0, 1, 2, ...
where
$$\binom{k+r-1}{k} = \frac{\Gamma(k+r)}{k!\,\Gamma(r)} = (-1)^k \binom{-r}{k}$$
and
- Γ(r) = (r − 1)! when r is a positive integer.
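The mass function can be evaluated numerically for real-valued r by working with the log-gamma function. The following is a minimal sketch, assuming the form Γ(k+r)/(k! Γ(r)) · p^r (1 − p)^k; the function name `negbin_pmf` is chosen here for illustration and does not come from any particular library.

```python
import math

def negbin_pmf(k, r, p):
    """Pr[K = k] for K ~ NegBin(r, p): k failures before the r-th success.

    Uses log-gamma so that r may be any positive real number.
    """
    if k < 0:
        return 0.0
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))

# Integer r: the coefficient reduces to an ordinary binomial coefficient.
r, p = 3, 1 / 6
direct = math.comb(2 + r - 1, 2) * p**r * (1 - p) ** 2
print(abs(negbin_pmf(2, r, p) - direct) < 1e-12)

# Real-valued r: the probabilities still sum to 1 (truncated sum).
total = sum(negbin_pmf(k, 2.5, 0.3) for k in range(2000))
print(abs(total - 1.0) < 1e-9)
```

Working in log space avoids overflow in Γ(k + r) for large k, at the cost of a small rounding error in the final exponentiation.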
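Limiting case
Under an alternative parameterization in terms of the mean λ,
$$\lambda = r\left(\frac{1}{p} - 1\right), \qquad p = \frac{r}{r+\lambda},$$
the mass function becomes
$$g(k; r, \lambda) = \frac{\lambda^k}{k!} \cdot \frac{\Gamma(r+k)}{\Gamma(r)\,(r+\lambda)^k} \cdot \frac{1}{\left(1+\frac{\lambda}{r}\right)^{r}},$$
where λ and r are positive real parameters. Under this parameterization, we have
$$\lim_{r\to\infty} g(k; r, \lambda) = \frac{\lambda^k}{k!} \cdot 1 \cdot \frac{1}{e^{\lambda}},$$
which is the mass function of a Poisson-distributed random variable with Poisson rate λ. In other words, as r → ∞ the alternatively parameterized negative binomial distribution converges to the Poisson distribution, and r controls the deviation from the Poisson. The variance under this parameterization is λ + λ²/r, which exceeds the Poisson variance λ for every finite r. This makes the negative binomial distribution suitable as a robust alternative to the Poisson: it approaches the Poisson for large r but has larger variance for small r.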
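The convergence to the Poisson distribution can be observed numerically. The sketch below assumes the mean parameterization p = r/(r + λ) and compares the two mass functions pointwise for increasing r; the helper names `negbin_pmf_mean`, `poisson_pmf`, and `max_gap` are illustrative only.

```python
import math

def negbin_pmf_mean(k, r, lam):
    """NegBin mass function in the (r, lambda) form, with p = r / (r + lambda)."""
    p = r / (r + lam)
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))

def poisson_pmf(k, lam):
    """Poisson mass function with rate lam, evaluated in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def max_gap(r, lam, k_max=60):
    """Largest pointwise difference between the two mass functions on 0..k_max-1."""
    return max(abs(negbin_pmf_mean(k, r, lam) - poisson_pmf(k, lam))
               for k in range(k_max))

lam = 4.0
for r in (1, 10, 100, 10_000):
    print(f"r = {r:>6}: largest pointwise gap = {max_gap(r, lam):.2e}")
```

The printed gap should shrink as r grows, which is the sense in which r controls the deviation from the Poisson.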
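Gamma-Poisson mixture
The negative binomial distribution also arises as a continuous mixture of Poisson distributions where the mixing distribution of the Poisson rate is a gamma distribution. Formally, this means that the mass function of the negative binomial distribution can also be written as
$$f(k; r, p) = \int_0^\infty \mathrm{Poisson}(k \mid \lambda)\, \mathrm{Gamma}\!\left(\lambda \,\Big|\, r, \tfrac{1-p}{p}\right) d\lambda = \int_0^\infty \frac{\lambda^k e^{-\lambda}}{k!} \cdot \frac{\lambda^{r-1} e^{-\lambda p/(1-p)}}{\left(\frac{1-p}{p}\right)^{r} \Gamma(r)}\, d\lambda = \frac{\Gamma(r+k)}{k!\,\Gamma(r)}\, p^r (1-p)^k,$$
where the gamma distribution has shape r and scale (1 − p)/p.
Because of this, the negative binomial distribution is also known as the gamma-Poisson (mixture) distribution.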
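The mixture representation can be checked by numerical integration. The following sketch assumes a gamma mixing density with shape r and scale (1 − p)/p and uses a plain midpoint rule; the step count, the truncation point `upper`, and the function names are arbitrary choices made for this illustration, and the rule is only expected to be accurate when r ≥ 1 (for r < 1 the integrand is singular at zero).

```python
import math

def negbin_pmf(k, r, p):
    """Closed-form NegBin(r, p) mass function, evaluated in log space."""
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))

def gamma_poisson_pmf(k, r, p, steps=100_000, upper=100.0):
    """Integrate Poisson(k | lam) * Gamma(lam | shape r, scale (1-p)/p) over lam."""
    scale = (1 - p) / p
    log_norm = -math.lgamma(k + 1) - r * math.log(scale) - math.lgamma(r)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h  # midpoint rule, so lam = 0 is never evaluated
        log_density = (k + r - 1) * math.log(lam) - lam - lam / scale
        total += math.exp(log_density + log_norm)
    return total * h

r, p = 2.5, 0.3
for k in (0, 3, 10):
    print(k, gamma_poisson_pmf(k, r, p), negbin_pmf(k, r, p))
```

Each printed pair should agree to several decimal places, up to the discretization error of the midpoint rule.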
Cumulative distribution function
The cumulative distribution function can be expressed in terms of the regularized incomplete beta function:
$$F(k; r, p) = \Pr[K \le k] = I_p(r, k+1).$$
Occurrence
Waiting time in a Bernoulli process
For the special case where r is an integer, the negative binomial distribution is known as the Pascal distribution. It is the probability distribution of a certain number of failures and successes in a series of independent and identically distributed Bernoulli trials. For k + r Bernoulli trials with success probability p, the negative binomial gives the probability of k failures and r successes, with a success on the last trial. In other words, the negative binomial distribution is the probability distribution of the number of failures before the rth success in a Bernoulli process with probability p of success on each trial. A Bernoulli process is a discrete-time process, so the numbers of trials, failures, and successes are integers.
Consider the following example. Suppose we repeatedly throw a die, and consider a "1" to be a "success". The probability of success on each trial is 1/6. The number of trials needed to get three successes belongs to the infinite set { 3, 4, 5, 6, ... }. That number of trials is a (displaced) negative-binomially distributed random variable. The number of failures before the third success belongs to the infinite set { 0, 1, 2, 3, ... }. That number of failures is also a negative-binomially distributed random variable.
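The die example can be simulated directly. The sketch below is only an illustration: it throws a simulated die until three 1s have appeared, counts the failures, and averages over many runs; the run count and seed are arbitrary. For NegBin(r, p) the mean number of failures is r(1 − p)/p, which is 15 for r = 3 and p = 1/6, so the simulated average should land near that value.

```python
import random

def failures_before_rth_one(rng, r=3):
    """Throw a fair die until r ones have appeared; return the number of non-ones."""
    successes = failures = 0
    while successes < r:
        if rng.randint(1, 6) == 1:
            successes += 1
        else:
            failures += 1
    return failures

def simulate_mean_failures(r=3, runs=20_000, seed=0):
    """Average failure count over many independent runs of the experiment."""
    rng = random.Random(seed)
    return sum(failures_before_rth_one(rng, r) for _ in range(runs)) / runs

mean_failures = simulate_mean_failures()
print(f"mean failures: {mean_failures:.2f}")
print(f"mean trials:   {mean_failures + 3:.2f}")  # trials = failures + 3 successes
```

The second line reflects the displaced variable from the text: the number of trials is always the number of failures plus three.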
When r = 1 we get the probability distribution of the number of failures before the first success (i.e. the probability that the first success occurs on the (k+1)th trial), which is a geometric distribution:
$$f(k; 1, p) = p\,(1-p)^k$$
Overdispersed Poisson
The negative binomial distribution, especially in its alternative parameterization described above, can be used as an alternative to the Poisson distribution. It is especially useful for discrete data over an unbounded nonnegative range whose sample variance exceeds the sample mean. If a Poisson distribution is used to model such data, the model mean and variance are constrained to be equal, and the observations are then overdispersed with respect to the Poisson model. Since the negative binomial distribution has one more parameter than the Poisson, the second parameter can be used to adjust the variance independently of the mean. See also the cumulants of some discrete probability distributions.
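One simple way to use the extra parameter is to match the first two moments. The following is a sketch of a method-of-moments fit, not a recommended estimation procedure: it assumes the NegBin(r, p) mean r(1 − p)/p and variance r(1 − p)/p², solves them for r and p, and applies the result to a made-up list of counts. The function name `negbin_from_moments` and the data are hypothetical.

```python
import statistics

def negbin_from_moments(mean, variance):
    """Solve mean = r(1-p)/p and variance = r(1-p)/p^2 for (r, p)."""
    if mean <= 0 or variance <= mean:
        raise ValueError("need 0 < mean < variance (overdispersed data)")
    p = mean / variance
    r = mean * mean / (variance - mean)
    return r, p

counts = [0, 2, 1, 7, 0, 3, 12, 1, 0, 4]  # made-up counts, for illustration only
m = statistics.mean(counts)
v = statistics.variance(counts)  # sample variance (n - 1 denominator)
r_hat, p_hat = negbin_from_moments(m, v)
print(f"mean={m:.2f} variance={v:.2f} -> r={r_hat:.3f}, p={p_hat:.3f}")
```

The function refuses data whose variance does not exceed the mean, since no negative binomial distribution has that property.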
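Related distributions
- The geometric distribution is a special case of the negative binomial distribution, with
  $$\mathrm{Geom}(p) = \mathrm{NegBin}(1, p).$$
- The negative binomial distribution converges to the Poisson distribution in the following sense:
  $$\mathrm{Poisson}(\lambda) = \lim_{r\to\infty} \mathrm{NegBin}\!\left(r, \frac{r}{\lambda + r}\right).$$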
- The negative binomial distribution is a special case of the discrete phase-type distribution.
Properties
Relation to other distributions
If X_r is a random variable following the negative binomial distribution with parameters r and p, where r is a positive integer, then X_r is a sum of r independent variables following the geometric distribution with parameter p. As a result of the central limit theorem, X_r is therefore approximately normal for sufficiently large r.
Furthermore, if Y_{s+r} is a random variable following the binomial distribution with parameters s + r and p, then
$$\Pr[X_r \le s] = I_p(r, s+1) = \Pr[Y_{s+r} \ge r],$$
that is, the probability that there are at least r successes after s + r trials.
In this sense, the negative binomial distribution is the "inverse" of the binomial distribution.
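This "inverse" relationship, Pr[X_r ≤ s] = Pr[Y_{s+r} ≥ r], can be verified numerically for integer r. The sketch below sums both mass functions directly; the function names are illustrative.

```python
import math

def negbin_cdf(s, r, p):
    """Pr[X <= s] for X ~ NegBin(r, p) with integer r, by direct summation."""
    return sum(math.comb(k + r - 1, k) * p**r * (1 - p) ** k for k in range(s + 1))

def binom_upper_tail(n, p, r):
    """Pr[Y >= r] for Y ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(r, n + 1))

r, p = 3, 1 / 6
for s in (0, 5, 20):
    print(s, negbin_cdf(s, r, p), binom_upper_tail(s + r, p, r))
```

Both columns describe the same event: the rth success has occurred by trial s + r exactly when the first s + r trials contain at least r successes.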
The sum of independent negative-binomially distributed random variables with the same value of the parameter p but the "r-values" r1 and r2 is negative-binomially distributed with the same p but with "r-value" r1 + r2.
The negative binomial distribution is infinitely divisible, i.e., if X has a negative binomial distribution, then for any positive integer n, there exist independent identically distributed random variables X1, ..., Xn whose sum has the same distribution that X has. These will not be negative-binomially distributed in the sense defined above unless n is a divisor of r (more on this below).
Sampling and point estimation of p
Suppose p is unknown and an experiment is conducted where it is decided ahead of time that sampling will continue until r successes are found. A sufficient statistic for the experiment is k, the number of failures.
In estimating p, the minimum variance unbiased estimator is $\hat{p} = \frac{r-1}{r+k-1}$. One might expect the estimator to be $\tilde{p} = \frac{r}{r+k}$, which is the maximum likelihood estimate, but this is biased (see the article by Haldane).
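The bias can be seen by computing the expectation of each estimator under the NegBin(r, p) mass function. The following sketch assumes the estimators (r − 1)/(r + k − 1) and r/(r + k) and sums over a truncated range of k; the function names and the truncation point are choices made for this illustration, and r ≥ 2 is assumed so that the first estimator is well defined at k = 0.

```python
import math

def negbin_pmf(k, r, p):
    """NegBin(r, p) mass function, evaluated in log space."""
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))

def expectation(estimator, r, p, terms=3000):
    """E[estimator(r, K)] for K ~ NegBin(r, p), by truncated summation."""
    return sum(estimator(r, k) * negbin_pmf(k, r, p) for k in range(terms))

def unbiased_estimate(r, k):
    return (r - 1) / (r + k - 1)  # assumes r >= 2

def naive_estimate(r, k):
    return r / (r + k)

r, p = 5, 0.4
print("unbiased:", expectation(unbiased_estimate, r, p))
print("naive:   ", expectation(naive_estimate, r, p))
```

The first expectation should reproduce p, while the second should come out larger than p, in line with Jensen's inequality applied to the convex function k ↦ r/(r + k).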
Relation to the binomial theorem
Suppose K is a random variable with a negative binomial distribution with parameters r and p. The statement that the sum from k = 0 to infinity, of the probability Pr[K = k], is equal to 1, can be shown by a bit of algebra to be equivalent to the statement that $(1 - p)^{-r}$ is what Newton's binomial theorem says it should be.
Suppose Y is a random variable with a binomial distribution with parameters n and p. The statement that the sum from y = 0 to n, of the probability Pr[Y = y], is equal to 1, says that $1 = (p + (1 - p))^n$ is what the strictly finitary binomial theorem of rudimentary algebra says it should be.
Thus the negative binomial distribution bears the same relationship to the negative-integer-exponent case of the binomial theorem that the binomial distribution bears to the positive-integer-exponent case.
Assume p + q = 1. Then the binomial theorem of elementary algebra implies that
$$1 = 1^n = (p+q)^n = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k}.$$
This can be written in a way that may at first appear to some to be incorrect, and perhaps perverse even if correct:
$$(p+q)^n = \sum_{k=0}^{\infty} \binom{n}{k} p^k q^{n-k},$$
in which the upper bound of summation is infinite. The binomial coefficient
$$\binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k!}$$
is defined even when n is negative or is not an integer. But in our case of the binomial distribution it is zero when k > n. So why would we write the result in that form, with a seemingly needless sum of infinitely many zeros? The answer comes when we generalize the binomial theorem of elementary algebra to Newton's binomial theorem. Then we can say, for example
$$(p+q)^{8.3} = \sum_{k=0}^{\infty} \binom{8.3}{k} p^k q^{8.3-k}.$$
Now suppose r > 0 and we use a negative exponent:
$$1 = p^r \cdot p^{-r} = p^r (1-q)^{-r} = p^r \sum_{k=0}^{\infty} \binom{-r}{k} (-q)^k.$$
Then all of the terms are positive, and the term
$$p^r \binom{-r}{k} (-q)^k$$
is just the probability that the number of failures before the rth success is equal to k, provided r is an integer. (If r is a negative non-integer, so that the exponent is a positive non-integer, then some of the terms in the sum above are negative, so we do not have a probability distribution on the set of all nonnegative integers.)
Now we also allow non-integer values of r. Then we have a proper negative binomial distribution, which is a generalization of the Pascal distribution, which coincides with the Pascal distribution when r happens to be a positive integer.
Recall from above that
- The sum of independent negative-binomially distributed random variables with the same value of the parameter p but the "r-values" r1 and r2 is negative-binomially distributed with the same p but with "r-value" r1 + r2.
This property persists when the definition is thus generalized, and affords a quick way to see that the negative binomial distribution is infinitely divisible.
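The additivity property for real-valued r can be checked by convolving two mass functions. The sketch below uses non-integer values r1 = 0.7 and r2 = 1.8, chosen arbitrarily, and compares the convolution with the NegBin(r1 + r2, p) mass function; the function names are illustrative.

```python
import math

def negbin_pmf(k, r, p):
    """NegBin(r, p) mass function for real r > 0, evaluated in log space."""
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))

def sum_pmf(r1, r2, p, n_terms):
    """Mass function of X1 + X2 on 0..n_terms-1, by discrete convolution."""
    f = [negbin_pmf(k, r1, p) for k in range(n_terms)]
    g = [negbin_pmf(k, r2, p) for k in range(n_terms)]
    return [sum(f[i] * g[k - i] for i in range(k + 1)) for k in range(n_terms)]

r1, r2, p = 0.7, 1.8, 0.35
conv = sum_pmf(r1, r2, p, 40)
direct = [negbin_pmf(k, r1 + r2, p) for k in range(40)]
print(max(abs(a - b) for a, b in zip(conv, direct)))
```

The printed maximum difference should be at the level of floating-point rounding. Taking r1 = r2 = ... = r/n in the same way illustrates infinite divisibility.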
Examples
(After a problem by Dr. Diane Evans, professor of mathematics at Rose-Hulman Institute of Technology)
Pat is required to sell candy bars to raise money for the 6th grade field trip. There are thirty houses in the neighborhood, and Pat is not supposed to return home until five candy bars have been sold. So the child goes door to door, selling candy bars. At each house, there is a 0.4 probability of selling one candy bar and a 0.6 probability of selling nothing.
What's the probability mass function for selling the last candy bar at the nth house?
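Recall that the NegBin(r, p) distribution describes the probability of k failures and r successes in k + r Bernoulli(p) trials with success on the last trial. Selling five candy bars means getting five successes. The number of trials (i.e. houses) this takes is therefore k + 5 = n. The random variable we are interested in is the number of houses, so we substitute k = n − 5 into a NegBin(5, 0.4) mass function and obtain the following mass function of the distribution of houses (for n ≥ 5):
$$f(n) = \binom{(n-5)+5-1}{n-5}\, 0.4^5\, 0.6^{n-5} = \binom{n-1}{n-5}\, 2^5\, \frac{3^{n-5}}{5^{n}}.$$
What's the probability that Pat finishes on the tenth house?
$$f(10) = \binom{9}{5}\, 0.4^5\, 0.6^5 \approx 0.10033.$$
What's the probability that Pat finishes on or before reaching the eighth house?
To finish on or before the eighth house, Pat must finish at the fifth, sixth, seventh, or eighth house. Sum those probabilities:
$$f(5) = 0.01024, \quad f(6) = 0.03072, \quad f(7) = 0.055296, \quad f(8) = 0.0774144,$$
$$\sum_{j=5}^{8} f(j) = 0.1736704.$$
What's the probability that Pat exhausts all 30 houses in the neighborhood?
$$1 - \sum_{j=5}^{30} f(j) = 1 - I_{0.4}(5,\, 30-5+1) \approx 1 - 0.99849 = 0.00151.$$
This is the probability that Pat has still not sold five candy bars after visiting all 30 houses.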
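The three candy-bar questions can be answered numerically from the shifted mass function. The sketch below assumes r = 5 sales and p = 0.4 as in the problem statement, and interprets "exhausts all 30 houses" as not having made the fifth sale within the 30 houses; the function name `houses_pmf` is illustrative.

```python
import math

def houses_pmf(n, r=5, p=0.4):
    """Probability that the r-th sale happens exactly at house n (n >= r)."""
    if n < r:
        return 0.0
    k = n - r  # number of houses with no sale
    return math.comb(n - 1, k) * p**r * (1 - p) ** k

p_tenth = houses_pmf(10)
p_by_eighth = sum(houses_pmf(n) for n in range(5, 9))
p_exhaust = 1.0 - sum(houses_pmf(n) for n in range(5, 31))

print(f"finishes on the tenth house:       {p_tenth:.5f}")
print(f"finishes on or before the eighth:  {p_by_eighth:.5f}")
print(f"visits all 30 houses, not done:    {p_exhaust:.5f}")
```

The last quantity equals the probability of fewer than five successes in 30 Bernoulli(0.4) trials, which gives an independent way to check it.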