Moment Generating Functions
Motivation: why this matters in quant finance
The moment generating function packages a distribution's moments into one expectation:
$$M_X(s) = \mathbb{E}[e^{sX}].$$
In Black-Scholes calculations, this is not decorative. The identity
$$\mathbb{E}[e^{\sigma W_T}] = e^{\sigma^2 T/2}$$
is an MGF evaluation for a normal random variable. It is the exponential-moment calculation behind lognormal prices, convexity corrections, and the $-\sigma^2/2$ drift adjustment in geometric Brownian motion.
Bertsekas presents transforms as alternative representations of probability laws: not especially intuitive at first, but powerful for moments, distribution identification, and sums of independent variables. This lesson follows that route. The MGF is useful precisely when exponential moments exist; when they do not, the nearby characteristic functions lesson takes over.
The informal idea
Expand the exponential:
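$$e^{sX} = 1 + sX + \frac{s^2 X^2}{2!} + \frac{s^3 X^3}{3!} + \cdots.$$
Taking expectations gives
$$M_X(s) = 1 + s\,\mathbb{E}[X] + \frac{s^2}{2!}\,\mathbb{E}[X^2] + \frac{s^3}{3!}\,\mathbb{E}[X^3] + \cdots.$$
So the MGF is a generating function for moments. Differentiating at zero reads the moments back out. The catch is existence: $\mathbb{E}[e^{sX}]$ may be infinite away from zero.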
Formal definition
The moment generating function of a random variable $X$ is
$$M_X(s) = \mathbb{E}[e^{sX}],$$
for all real $s$ where the expectation is finite. Its domain is
$$D_X = \{s \in \mathbb{R} : M_X(s) < \infty\}.$$
If $D_X$ contains an open interval around $0$, the MGF is especially useful: it determines the distribution, and its derivatives at zero give the moments.
For a discrete random variable,
$$M_X(s) = \sum_x e^{sx} p_X(x).$$
For a continuous random variable with density,
$$M_X(s) = \int_{-\infty}^{\infty} e^{sx} f_X(x)\,dx.$$
Key properties
Moment reading
If the MGF exists in a neighbourhood of zero, then
$$M_X^{(k)}(0) = \mathbb{E}[X^k].$$
In particular,
$$\mathbb{E}[X] = M_X'(0), \qquad \operatorname{Var}(X) = M_X''(0) - (M_X'(0))^2.$$
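As a rough numerical illustration of moment reading, the sketch below builds the MGF of a fair six-sided die from the discrete formula and estimates its first two derivatives at zero by central finite differences. The die, the helper names `mgf` and `moments_from_mgf`, and the step size `h` are assumptions chosen for illustration; they are not taken from the source.

```python
import math

def mgf(pmf, s):
    """Discrete MGF: sum over x of e^{s x} * p(x)."""
    return sum(p * math.exp(s * x) for x, p in pmf.items())

def moments_from_mgf(pmf, h=1e-4):
    """Approximate M'(0) and M''(0) by central finite differences."""
    first = (mgf(pmf, h) - mgf(pmf, -h)) / (2 * h)
    second = (mgf(pmf, h) - 2 * mgf(pmf, 0.0) + mgf(pmf, -h)) / h**2
    return first, second

# Hypothetical example distribution: a fair six-sided die.
die = {face: 1 / 6 for face in range(1, 7)}

m1, m2 = moments_from_mgf(die)
variance = m2 - m1**2

# Exact values for comparison: E[X] = 3.5 and Var(X) = 35/12.
print("mean estimate:", m1)
print("variance estimate:", variance)
```

Finite differences are used here only to keep the sketch dependency-free; a symbolic or automatic differentiation tool would read the same derivatives without step-size error.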
Affine transformations
If $Y = aX + b$, then
$$M_Y(s) = e^{sb} M_X(as).$$
Bertsekas uses this to derive the MGF of a general normal from the standard normal.
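The affine rule is a one-line consequence of the definition, and the normal case can be written out as a worked step. The standard normal MGF $M_Z(s) = e^{s^2/2}$ is assumed here as a known fact rather than derived:

$$M_Y(s) = \mathbb{E}[e^{s(aX + b)}] = e^{sb}\,\mathbb{E}[e^{(as)X}] = e^{sb} M_X(as).$$

Taking $Z \sim N(0, 1)$ and $X = \sigma Z + \mu$, so that $a = \sigma$ and $b = \mu$, gives

$$M_X(s) = e^{\mu s} M_Z(\sigma s) = \exp\left(\mu s + \tfrac{1}{2}\sigma^2 s^2\right).$$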
Independent sums
If $X$ and $Y$ are independent, then
$$M_{X+Y}(s) = M_X(s) M_Y(s).$$
This is why MGFs make sums of independent Poisson, binomial, and normal variables easy.
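The product rule itself follows directly from independence, because the expectation of a product of independent factors splits. For any $s$ at which both MGFs are finite:

$$M_{X+Y}(s) = \mathbb{E}[e^{s(X+Y)}] = \mathbb{E}[e^{sX} e^{sY}] = \mathbb{E}[e^{sX}]\,\mathbb{E}[e^{sY}] = M_X(s) M_Y(s).$$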
Uniqueness
If $M_X(s) = M_Y(s)$ on an open interval around zero, then $X$ and $Y$ have the same distribution.
Cumulants
The log-MGF
$$K_X(s) = \log M_X(s)$$
is the cumulant generating function. Independent sums add cumulants because MGFs multiply and logarithms turn products into sums.
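A small numerical check can make the additivity concrete. The sketch below convolves two independent discrete distributions, confirms that the MGFs multiply at one test point, and estimates the first two cumulants (mean and variance) from finite differences of the log-MGF. The die and coin distributions, the helper names, and the step size `h` are illustrative assumptions.

```python
import math

def mgf(pmf, s):
    """Discrete MGF: sum over x of e^{s x} * p(x)."""
    return sum(p * math.exp(s * x) for x, p in pmf.items())

def cgf(pmf, s):
    """Cumulant generating function K(s) = log M(s)."""
    return math.log(mgf(pmf, s))

def convolve(pmf_x, pmf_y):
    """pmf of X + Y when X and Y are independent."""
    out = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

def first_two_cumulants(pmf, h=1e-4):
    """Approximate K'(0) (mean) and K''(0) (variance) by central differences."""
    k1 = (cgf(pmf, h) - cgf(pmf, -h)) / (2 * h)
    k2 = (cgf(pmf, h) - 2 * cgf(pmf, 0.0) + cgf(pmf, -h)) / h**2
    return k1, k2

# Hypothetical inputs: a fair die and a fair 0/1 coin, assumed independent.
die = {face: 1 / 6 for face in range(1, 7)}
coin = {0: 0.5, 1: 0.5}
total = convolve(die, coin)

k_die = first_two_cumulants(die)
k_coin = first_two_cumulants(coin)
k_total = first_two_cumulants(total)

print("MGF of sum at s=0.3:", mgf(total, 0.3))
print("product of MGFs:    ", mgf(die, 0.3) * mgf(coin, 0.3))
print("cumulants of sum:   ", k_total)
print("sum of cumulants:   ", (k_die[0] + k_coin[0], k_die[1] + k_coin[1]))
```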
Worked examples
Example 1: normal MGF
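For $X \sim N(\mu, \sigma^2)$,
$$M_X(s) = \exp\left(\mu s + \tfrac{1}{2}\sigma^2 s^2\right).$$
Then $M_X'(0) = \mu$ and $M_X''(0) = \mu^2 + \sigma^2$, so $\operatorname{Var}(X) = \sigma^2$.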
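For completeness, the two derivatives can be written out by the chain and product rules:

$$M_X'(s) = (\mu + \sigma^2 s)\,M_X(s), \qquad M_X''(s) = \left[\sigma^2 + (\mu + \sigma^2 s)^2\right] M_X(s).$$

Setting $s = 0$ and using $M_X(0) = 1$ gives $M_X'(0) = \mu$ and $M_X''(0) = \sigma^2 + \mu^2$, hence $\operatorname{Var}(X) = (\sigma^2 + \mu^2) - \mu^2 = \sigma^2$.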
Example 2: Black-Scholes exponential moment
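If $W_T \sim N(0, T)$, then
$$\mathbb{E}[e^{\sigma W_T}] = M_{W_T}(\sigma) = e^{\sigma^2 T/2}.$$
Thus
$$\mathbb{E}\left[S_0 e^{(\mu - \sigma^2/2)T + \sigma W_T}\right] = S_0 e^{(\mu - \sigma^2/2)T} e^{\sigma^2 T/2} = S_0 e^{\mu T}.$$
The drift correction is not a convention; it is the MGF calculation that keeps the expected stock price at $S_0 e^{\mu T}$.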
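The identity can be sanity-checked numerically. The sketch below integrates against the $N(0, T)$ density with a plain trapezoidal rule and compares the results with the closed forms. The parameter values, the helper name `expect_under_normal`, and the grid settings are assumptions made for illustration only.

```python
import math

def expect_under_normal(g, mean, var, n=20001, width=12.0):
    """Trapezoidal approximation of E[g(Z)] for Z ~ N(mean, var).

    The integral is truncated at `width` standard deviations either side.
    """
    sd = math.sqrt(var)
    lo = mean - width * sd
    hi = mean + width * sd
    dx = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * dx
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        density = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
        total += weight * g(x) * density * dx
    return total

# Hypothetical parameters, not taken from the source.
S0, mu, sigma, T = 100.0, 0.08, 0.25, 2.0

exp_moment = expect_under_normal(lambda w: math.exp(sigma * w), 0.0, T)
mean_price = expect_under_normal(
    lambda w: S0 * math.exp((mu - 0.5 * sigma**2) * T + sigma * w), 0.0, T
)

print("E[exp(sigma W_T)] numeric:", exp_moment, "closed form:", math.exp(0.5 * sigma**2 * T))
print("E[S_T] numeric:", mean_price, "closed form:", S0 * math.exp(mu * T))
```

Deterministic quadrature is used instead of Monte Carlo so the comparison is free of sampling noise; a simulation would show the same agreement up to standard error.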
Example 3: sum of independent Poissons
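For $X \sim \text{Poisson}(\lambda)$,
$$M_X(s) = \exp\left(\lambda(e^s - 1)\right).$$
If $X$ and $Y$ are independent Poisson variables with means $\lambda$ and $\mu$, then
$$M_{X+Y}(s) = \exp\left((\lambda + \mu)(e^s - 1)\right),$$
so $X + Y \sim \text{Poisson}(\lambda + \mu)$ by uniqueness.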
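The MGF argument can be cross-checked against the direct route. The sketch below computes the pmf of the sum by discrete convolution and compares it with the Poisson pmf at the summed mean. The rates and the helper names `poisson_pmf` and `pmf_of_sum` are illustrative assumptions.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for N ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def pmf_of_sum(n, lam, mu):
    """P(X + Y = n) by convolving two independent Poisson pmfs."""
    return sum(poisson_pmf(k, lam) * poisson_pmf(n - k, mu) for k in range(n + 1))

# Hypothetical rates, not taken from the source.
lam, mu = 1.5, 2.5

# Largest pointwise gap between the convolution and Poisson(lam + mu) on 0..29.
max_gap = max(
    abs(pmf_of_sum(n, lam, mu) - poisson_pmf(n, lam + mu)) for n in range(30)
)
print("largest pmf discrepancy:", max_gap)
```

The convolution needs a binomial-theorem argument to simplify by hand, whereas the MGF route reaches the same conclusion by multiplying two exponentials.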
Example 4: Chernoff bound logic
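For $s > 0$,
$$\mathbb{P}(X \ge a) = \mathbb{P}(e^{sX} \ge e^{sa}) \le e^{-sa} M_X(s),$$
where the last step is Markov's inequality applied to the nonnegative variable $e^{sX}$.
Optimising over $s$ gives exponential tail bounds. This is why MGFs appear in concentration inequalities, large deviations, and stress testing.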
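To see the optimisation step in action, the sketch below takes a standard normal $X$, for which $M_X(s) = e^{s^2/2}$, minimises $e^{-sa} M_X(s)$ over a grid of $s$ values, and compares the result with the exact tail probability. The threshold `a`, the grid, and the function names are assumptions for illustration; a grid search stands in for the calculus, which gives the minimiser $s = a$ and the bound $e^{-a^2/2}$.

```python
import math

def chernoff_bound(a, s):
    """e^{-s a} * M_X(s) for X ~ N(0, 1), where M_X(s) = exp(s^2 / 2)."""
    return math.exp(-s * a + 0.5 * s * s)

def optimised_bound(a, grid_size=2001, s_max=10.0):
    """Minimise the bound over an evenly spaced grid of s in (0, s_max]."""
    candidates = [s_max * i / (grid_size - 1) for i in range(1, grid_size)]
    best_s = min(candidates, key=lambda s: chernoff_bound(a, s))
    return best_s, chernoff_bound(a, best_s)

def exact_tail(a):
    """P(X >= a) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(a / math.sqrt(2))

a = 3.0  # hypothetical threshold
best_s, bound = optimised_bound(a)

print("minimising s:", best_s)
print("Chernoff bound:", bound)
print("exact tail:", exact_tail(a))
```

The bound is looser than the exact tail by a polynomial factor but captures the correct exponential decay rate in $a$.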
Common confusions and pitfalls
"The MGF always exists." No. Heavy-tailed distributions may have $M_X(s) = \infty$ for every $s > 0$. A lognormal variable has $M_X(s) = \infty$ for every $s > 0$, even though all of its positive integer moments are finite.
"The value at $s = 0$ is informative." $M_X(0) = 1$ for every random variable. The information is in the behaviour around zero.
"MGFs add under independent sums." They multiply. Log-MGFs, or cumulant generating functions, add.
"Matching moments always matches distributions." Not in full generality. MGF uniqueness is stronger because it requires the transform on an interval, not only a sequence of moments.
"MGF and characteristic function are the same tool." They share transform algebra, but the MGF is an exponential-moment object and can fail. The characteristic function is Fourier-based and always exists.
Where this goes next
- Characteristic Functions: The always-defined Fourier analogue.
- Normal Distribution: Its MGF drives lognormal and Black-Scholes calculations.
- Central Limit Theorem: Can be proved using MGFs under exponential-moment conditions.
- Law of Large Numbers: Explains why sample averages converge to expectations.
- The Derivation of the Black-Scholes Formula: Uses exponential-normal expectations in the closed-form pricing integral.
References
- Bertsekas, D. P., & Tsitsiklis, J. N. (2008). Introduction to Probability (2nd ed.). Athena Scientific. Ch. 4 §4.4 (Transforms).