Moment Generating Functions

Motivation: why this matters in quant finance

The moment generating function packages a distribution's moments into one expectation:

$$M_X(s)=\mathbb{E}[e^{sX}].$$

In Black-Scholes calculations, this is not decorative. The identity

$$\mathbb{E}[e^{\sigma W_T}]=e^{\sigma^2T/2}$$

is an MGF evaluation for a normal random variable. It is the exponential-moment calculation behind lognormal prices, convexity corrections, and the $-\sigma^2/2$ drift adjustment in geometric Brownian motion.

Bertsekas presents transforms as alternative representations of probability laws: not especially intuitive at first, but powerful for moments, distribution identification, and sums of independent variables. This lesson follows that route. The MGF is useful precisely when exponential moments exist; when they do not, the nearby characteristic functions lesson takes over.

The informal idea

Expand the exponential:

$$e^{sX}=1+sX+\frac{s^2X^2}{2!}+\frac{s^3X^3}{3!}+\cdots.$$

Taking expectations gives

$$M_X(s)=1+s\mathbb{E}[X]+\frac{s^2}{2!}\mathbb{E}[X^2]+\frac{s^3}{3!}\mathbb{E}[X^3]+\cdots.$$

So the MGF is a generating function for moments. Differentiating at zero reads the moments back out. The catch is existence: $\mathbb{E}[e^{sX}]$ may be infinite away from zero.

Formal definition

The moment generating function of a random variable $X$ is

$$M_X(s)=\mathbb{E}[e^{sX}],$$

for all real $s$ where the expectation is finite. Its domain is

$$D_X=\{s\in\mathbb{R}:M_X(s)<\infty\}.$$

If $D_X$ contains an open interval around $0$, the MGF is especially useful: it determines the distribution, and its derivatives at zero give the moments.

For a discrete random variable,

$$M_X(s)=\sum_x e^{sx}p_X(x).$$

For a continuous random variable with density,

$$M_X(s)=\int_{-\infty}^{\infty} e^{sx}f_X(x)\,dx.$$
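As a minimal sketch of the discrete formula, the snippet below evaluates an MGF directly from a finite pmf. The helper name `mgf_discrete` and the fair-die example are illustrative assumptions, not part of the lesson's source.

```python
import math

def mgf_discrete(pmf, s):
    """Evaluate M_X(s) = sum_x e^{s x} p_X(x) for a finite pmf given as {x: p}."""
    return sum(math.exp(s * x) * p for x, p in pmf.items())

# Illustrative example (an assumption): a fair six-sided die.
die = {x: 1 / 6 for x in range(1, 7)}

print(mgf_discrete(die, 0.0))  # M_X(0) equals 1, up to floating-point rounding
print(mgf_discrete(die, 0.5))  # the MGF evaluated away from zero
```

Because the pmf has finite support, the sum is finite for every real $s$, so the domain here is the whole real line.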

Key properties

Moment reading

If the MGF exists in a neighbourhood of zero, then

$$M_X^{(k)}(0)=\mathbb{E}[X^k].$$

In particular,

$$\mathbb{E}[X]=M_X'(0), \qquad \text{Var}(X)=M_X''(0)-\left(M_X'(0)\right)^2.$$
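The moment-reading rule can be checked numerically. The sketch below assumes an exponential variable with rate 2, whose MGF is $\lambda/(\lambda-s)$ for $s<\lambda$ (an example chosen for illustration, not taken from this lesson), and approximates the derivatives at zero with central finite differences. The function names and the step size `h` are hypothetical.

```python
def mgf_exponential(s, rate=2.0):
    """MGF of an Exponential(rate) variable; finite only for s < rate."""
    if s >= rate:
        raise ValueError("the MGF is infinite for s >= rate")
    return rate / (rate - s)

def first_two_derivatives_at_zero(mgf, h=1e-4):
    """Approximate M'(0) and M''(0) with central finite differences."""
    d1 = (mgf(h) - mgf(-h)) / (2 * h)
    d2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / (h * h)
    return d1, d2

m1, m2 = first_two_derivatives_at_zero(mgf_exponential)
mean = m1                 # E[X] = M'(0)
variance = m2 - m1 ** 2   # Var(X) = M''(0) - M'(0)^2
print(mean, variance)     # close to 1/rate = 0.5 and 1/rate**2 = 0.25
```

Finite differences are used only to keep the example dependency-free; a symbolic derivative would give the exact values.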

Affine transformations

If $Y=aX+b$, then

$$M_Y(s)=e^{sb}M_X(as).$$

Bertsekas uses this to derive the MGF of a general normal from the standard normal.
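The affine rule follows in one line from the definition. As a sketch of the normal derivation mentioned above, assume the standard normal MGF $M_Z(s)=e^{s^2/2}$, which this lesson does not derive:

$$M_Y(s)=\mathbb{E}\left[e^{s(aX+b)}\right]=e^{sb}\,\mathbb{E}\left[e^{(as)X}\right]=e^{sb}M_X(as).$$

Writing $X=\mu+\sigma Z$ with $Z\sim\mathcal{N}(0,1)$, so that $a=\sigma$ and $b=\mu$, gives

$$M_X(s)=e^{\mu s}M_Z(\sigma s)=\exp\left(\mu s+\frac12\sigma^2s^2\right).$$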

Independent sums

If $X$ and $Y$ are independent, then

$$M_{X+Y}(s)=M_X(s)M_Y(s).$$

This is why MGFs make it easy to identify sums of independent Poisson variables, binomial variables with a common success probability, and normal variables.
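A small numerical check of the product rule, assuming two hypothetical finite pmfs: the pmf of the sum is built by direct convolution, and its MGF is compared with the product of the individual MGFs. The helper names are illustrative.

```python
import math
from collections import defaultdict

def mgf_discrete(pmf, s):
    """M(s) = sum_x e^{s x} p(x) for a finite pmf given as {x: p}."""
    return sum(math.exp(s * x) * p for x, p in pmf.items())

def convolve(pmf_x, pmf_y):
    """pmf of X + Y when X and Y are independent."""
    out = defaultdict(float)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] += px * py
    return dict(out)

# Hypothetical pmfs chosen only for illustration.
pmf_x = {0: 0.3, 1: 0.7}
pmf_y = {-1: 0.25, 0: 0.25, 2: 0.5}
pmf_sum = convolve(pmf_x, pmf_y)

s = 0.4
lhs = mgf_discrete(pmf_sum, s)                         # M_{X+Y}(s)
rhs = mgf_discrete(pmf_x, s) * mgf_discrete(pmf_y, s)  # M_X(s) * M_Y(s)
print(lhs, rhs)  # equal up to floating-point rounding
```

The convolution step is where independence enters: the joint probability of each pair is taken to be the product of the marginals.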

Uniqueness

If $M_X(s)=M_Y(s)$ on an open interval around zero, then $X$ and $Y$ have the same distribution.

Cumulants

The log-MGF

$$K_X(s)=\log M_X(s)$$

is the cumulant generating function. Independent sums add cumulants because MGFs multiply and logarithms turn products into sums.
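Spelled out as a short derivation, assuming $X$ and $Y$ are independent and both MGFs are finite near zero:

$$K_{X+Y}(s)=\log\big(M_X(s)M_Y(s)\big)=K_X(s)+K_Y(s).$$

The first two cumulants are the mean and the variance. Since $M_X(0)=1$,

$$K_X'(0)=\frac{M_X'(0)}{M_X(0)}=\mathbb{E}[X], \qquad K_X''(0)=\frac{M_X''(0)M_X(0)-\left(M_X'(0)\right)^2}{M_X(0)^2}=\text{Var}(X).$$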

Worked examples

Example 1: normal MGF

For $X\sim\mathcal{N}(\mu,\sigma^2)$,

$$M_X(s)=\exp\left(\mu s+\frac12\sigma^2s^2\right).$$

Then $M_X'(0)=\mu$ and $M_X''(0)=\mu^2+\sigma^2$, so $\text{Var}(X)=\sigma^2$.
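For completeness, the two derivatives can be written out with the chain and product rules:

$$M_X'(s)=(\mu+\sigma^2s)\,M_X(s), \qquad M_X''(s)=\left(\sigma^2+(\mu+\sigma^2s)^2\right)M_X(s).$$

Setting $s=0$ and using $M_X(0)=1$ gives $M_X'(0)=\mu$ and $M_X''(0)=\sigma^2+\mu^2$, hence $\text{Var}(X)=M_X''(0)-\left(M_X'(0)\right)^2=\sigma^2$.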

Example 2: Black-Scholes exponential moment

If $W_T\sim\mathcal{N}(0,T)$, then

$$\mathbb{E}[e^{\sigma W_T}]=M_{W_T}(\sigma)=e^{\sigma^2T/2}.$$

Thus

$$\mathbb{E}\left[S_0e^{(\mu-\sigma^2/2)T+\sigma W_T}\right]=S_0e^{(\mu-\sigma^2/2)T}e^{\sigma^2T/2}=S_0e^{\mu T}.$$

The drift correction is not a convention; it is the MGF calculation that keeps the expected stock price at $S_0e^{\mu T}$.
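A Monte Carlo sanity check of this identity, as a rough sketch: the parameter values, the path count, the seed, and the function name are all hypothetical choices, and the estimate carries sampling noise.

```python
import math
import random

def simulate_mean_terminal_price(s0, mu, sigma, t, n_paths, seed=0):
    """Monte Carlo estimate of E[S_T] for S_T = S0 * exp((mu - sigma^2/2) T + sigma W_T)."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)  # W_T has the same law as sqrt(T) * Z
        total += s0 * math.exp(drift + vol * z)
    return total / n_paths

# Hypothetical parameters chosen only for illustration.
s0, mu, sigma, t = 100.0, 0.05, 0.2, 1.0
estimate = simulate_mean_terminal_price(s0, mu, sigma, t, n_paths=200_000)
exact = s0 * math.exp(mu * t)
print(estimate, exact)  # the estimate should sit close to S0 * exp(mu * T)
```

Dropping the `- 0.5 * sigma ** 2` term in `drift` would push the simulated mean up by the factor $e^{\sigma^2T/2}$, which is exactly the MGF value computed above.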

Example 3: sum of independent Poissons

For $X\sim\text{Poisson}(\lambda)$,

$$M_X(s)=\exp(\lambda(e^s-1)).$$

If $X$ and $Y$ are independent Poisson variables with means $\lambda$ and $\mu$, then

$$M_{X+Y}(s)=\exp((\lambda+\mu)(e^s-1)),$$

so $X+Y\sim\text{Poisson}(\lambda+\mu)$ by uniqueness.
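The same conclusion can be confirmed without transforms by convolving the two pmfs directly. The sketch below uses hypothetical means 1.5 and 2.5 and compares the convolution with the Poisson pmf of the summed mean over the first 30 values; the function names are illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def pmf_of_sum(k, lam, mu):
    """P(X + Y = k) by direct convolution of two independent Poisson pmfs."""
    return sum(poisson_pmf(j, lam) * poisson_pmf(k - j, mu) for j in range(k + 1))

lam, mu = 1.5, 2.5  # hypothetical means
max_gap = max(
    abs(pmf_of_sum(k, lam, mu) - poisson_pmf(k, lam + mu)) for k in range(30)
)
print(max_gap)  # differences are at the level of floating-point rounding
```

The MGF argument reaches the same result in one line, which is the practical advantage of the transform.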

Example 4: Chernoff bound logic

For $s>0$,

$$\mathbb{P}(X\ge a)=\mathbb{P}(e^{sX}\ge e^{sa})\le e^{-sa}M_X(s).$$

The inequality is Markov's inequality applied to the nonnegative variable $e^{sX}$. Optimising over $s>0$ gives exponential tail bounds. This is why MGFs appear in concentration inequalities, large deviations, and stress testing.
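To make the optimisation concrete, the sketch below assumes a standard normal variable, for which $e^{-sa}M_X(s)=e^{-sa+s^2/2}$ is minimised at $s=a$ with value $e^{-a^2/2}$. It minimises over a grid of $s$ values and compares the result with the exact tail probability. The grid, the threshold `a`, and the function names are hypothetical choices.

```python
import math

def normal_mgf(s, mu=0.0, sigma=1.0):
    """MGF of a N(mu, sigma^2) variable."""
    return math.exp(mu * s + 0.5 * sigma ** 2 * s ** 2)

def chernoff_bound(a, mgf, s_grid):
    """Smallest value of e^{-s a} * M(s) over the supplied grid of s > 0."""
    return min(math.exp(-s * a) * mgf(s) for s in s_grid)

def normal_upper_tail(a):
    """Exact P(Z >= a) for a standard normal Z."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))

a = 2.0
s_grid = [0.01 * k for k in range(1, 1001)]  # s ranges over (0, 10]
bound = chernoff_bound(a, normal_mgf, s_grid)
print(bound, math.exp(-a * a / 2), normal_upper_tail(a))
```

The bound is valid but not tight: it captures the exponential decay rate of the tail, not its exact size.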

Common confusions and pitfalls

"The MGF always exists." No. Heavy-tailed distributions may have $M_X(s)=\infty$ for every $s>0$. A lognormal variable, for example, has an infinite MGF at every $s>0$ even though all of its positive integer moments are finite.
"The value at $s=0$ is informative." No: $M_X(0)=1$ for every random variable. The information is in the behaviour around zero.
"MGFs add under independent sums." They multiply. Log-MGFs, or cumulant generating functions, add.
"Matching moments always matches distributions." Not in full generality. MGF uniqueness is stronger because it requires the transform on an interval, not only a sequence of moments.
"MGF and characteristic function are the same tool." They share transform algebra, but the MGF is an exponential-moment object and can fail. The characteristic function is Fourier-based and always exists.

Where this goes next

References

  • Bertsekas, D. P., & Tsitsiklis, J. N. (2008). Introduction to Probability (2nd ed.). Athena Scientific. Ch. 4 §4.4 (Transforms).
