Tim Barry: Gamma, Poisson, and negative binomial distributions

Tim Barry

The gamma, Poisson, and negative binomial distributions frequently are used to model RNAseq and single cell RNAseq data. Here, we review the statistical properties of these distributions. We postpone discussion of modeling RNAseq data to a subsequent post.

Gamma

The gamma distribution is a non-negative, continuous, two-parameter probability distribution. There are two common parameterizations of the gamma distribution: the “shape-scale” parameterization and the “shape-rate” parameterization. The pdf of the gamma distribution under the former parameterization is \[f(x;k,\theta) = \frac{1}{\Gamma(k)\theta^k} x^{k-1} e^{-\frac{x}{\theta}},\] where \(k > 0\) is the shape parameter, \(\theta > 0\) is the scale parameter, and \(\Gamma\) is the gamma function. Recall that, for positive real numbers, the gamma function is defined by \[ \Gamma(x) = \int_{0}^\infty t^{-x}e^{-t} dt.\] The Gamma distribution has support \((0,\infty)\). The mean and variance of the Gamma distribution (under shape-scale parameterization) are \(k\theta\) and \(k\theta^2\). The alternate parameterization of the gamma distribution — the shape-rate parameterization — sets \(\alpha = k\) and \(\beta = 1/\theta\). The parameter \(\alpha\) retains the name of the shape parameter \(k\), and \(\beta\) is called the scale parameter.

Poisson

The Poisson distribution is a discrete probability distribution used to model (non-negative) count data. The pmf of the Poisson distribution is \[ p(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!},\] where \(\lambda > 0\) is called the rate parameter. The support of the distribution is \(\mathbb{Z}^{\geq 0}\), and the mean and variance are \(\lambda\). The Poisson and Gamma distributions are members of the exponential family, and so parameter estimation (through, e.g., MLE) is simple in both models.

Negative binomial

The negative binomial (NB) distribution is a discrete probability distribution that takes support on the non-negative integers. The NB distribution models the number of failures in a sequence of independent trials before a specified number of successes occurs. For example, suppose we flip a coin repeatedly until we see 10 heads. The total number of tails we see throughout the experiment follows an NB distribution. The pmf of the NB distribution (under standard parameterization) is \[ p(x; r, p) = \binom{x + r - 1}{x} (1 - p)^r p^x, \] where \(r \in \mathbb{Z}^{\geq 0}\) is the number of successes, and \(p \in [0,1]\) is the probability of failure. The mean and variance of the NB distribution are \[\frac{pr}{1-p}\] and \[\frac{1 + p}{(1-p)^2}.\] The pmf of the NB distribution can be generalized to allow for \(r \in \mathbb{R}^{\geq 0}\) as follows: \[ p(x; r, p) = \frac{\Gamma(x + r)}{x! \Gamma(r)}(1-p)^rp^k.\] This extension follows from properties of the gamma function.

If one holds \(r\) fixed, the NB distribution is a member of the exponential family. However, when both \(r\) and \(p\) are allowed to vary, the NB distribution is not member of the exponential family. This means that parameter estimation in the two-parameter NB model is a bit more challenging than parameter estimation in many other common parameteric models. In particular, the NB maximum likelihood estimate fails to exist when the sample mean exceeds the sample seond moment (Aragon, Eberly, and Eberly 1992) .

Gamma-Poisson mixture

A very cool (and useful) fact about the negative binomial distribution is that it can be written as a mixture of gamma and Poisson distributions. Consider a Poisson model with gamma-distributed mean: \[ \begin{cases} Y \sim \textrm{Pois}(\theta) \\ \theta \sim \textrm{gamma}(r, \frac{p}{1-p}), \end{cases} \] where the gamma distribution is expressed in shape-scale parameterization. One can show that \(Y\) is negative binomially distributed with parameters \(r\) and \(p\), i.e. \[ Y \sim \textrm{NB}(r,p).\] A quick proof (taken from this blog post) is as follows: \[ \begin{align*} p(y; r, p) = \int_{0}^{\infty} p(y | \theta) p(\theta) d \theta \textrm{ (law of total probability)} \\ = \int_{0}^{\infty} \left( \frac{\theta^y e^{-\theta}}{y!} \right) \left( \frac{ \theta^{r-1} e^{-\theta (1 - p)/p} }{ \Gamma(r) \left(\frac{p}{1-p}\right)^r}\right) d\theta \textrm{ (plug in pdfs)} \\ = \frac{\Gamma(y + r)}{y! \Gamma(r)}(1-p)^r p^y \textrm{ (compute integral)} \\ \sim NB(r,p). \end{align*} \]

Parameterizations of the negative binomial distribution

There are several parameterizations of the negative binomial distribution. We list three such parameterizations in the table below. The left column shows the (shape-scale) parameterization of the underlying gamma distribution. The middle and right columns show the parameterization and pmf of the corresponding NB distribution.

Idx	Gamma parameterization	NB parameterization	NB pmf
1.	Gamma\(\left(r, \frac{p}{1 - p}\right)\)	NB\(\left(r,p\right)\)	\[\binom{x + r - 1}{x} p^x (1-p)^r \]
2.	Gamma\(\left(r, \frac{p}{r} \right)\)	NB\(\left(r, \frac{p}{p+r}\right)\)	\[ \binom{x + r - 1}{x} \left( \frac{p}{p+r} \right)^x \left( \frac{r}{p+r} \right)^r \]
3.	Gamma\(\left(r, p \right)\)	NB\(\left(r, \frac{p}{p+1}\right)\)	\[ \binom{x + r - 1}{x} \left( \frac{p}{p+1} \right)^x\left( \frac{1}{p+1} \right)^r \]

Of these three NB parameterizations, the most common are the first and second. The first is similar (but not identical) to the default used by the various NB functions (dnbinom, rnbinom, etc.) in base R (see documentation). The second is the default used by the function negbinomial in the popular R package VGAM. Note that in the second and third parameterizations, \(p\) takes values in \(\mathbb{R}^{\geq 0}\) and therefore cannot be interpreted as a probability.

NB, Poisson, and overdispersion

Consider the second parameterization of the NB distribution in the table above. Its mean is \[ \frac{p/(p+r)r}{1 -p/(p+r)} = p\] and its variance is \[ \frac{ p/(p+r) r }{ (1 - p/(p+r))^2 } = p + \frac{p^2}{r}.\] Thus, for \(r < \infty\), the variance of the negative binomial distribution exceeds the mean. This is an important difference between the Poisson and NB distributions — in the Poisson distribution, the mean and variance coincide. One can show that the NB distribution converges to the Poisson distribution as \(r\) tends to infinity. Informally, \[\lim_{r \to \infty} \textrm{NB}\left( r, \frac{\lambda}{r + \lambda}\right) = \textrm{Pois}(\lambda).\] Thus, for large \(r\), the NB and Poisson distributions are approximately equivalent. For small \(r\), the NB distribution is a sort of “overdispersed” Poisson distribution. This property of the NB distribution makes it an appealing candidate for modeling highly variable gene expression data.

Aragon, Jorge, David Eberly, and Shelly Eberly. 1992. “Existence and uniqueness of the maximum likelihood estimator for the two-parameter negative binomial distribution.” Statistics and Probability Letters 15 (5): 375–79. https://doi.org/10.1016/0167-7152(92)90157-Z.

Gamma, Poisson, and negative binomial distributions