Characteristic Functions

Motivation: why this matters in quant finance

Many asset-price models are easy to describe through the distribution of log returns but hard to price from a density. Heston, Variance Gamma, CGMY, and other affine or jump models often provide a closed-form characteristic function while the density is unavailable or expensive to evaluate. Fourier pricing methods exploit this: price from the transform, not from the density.
A characteristic function is also the transform that does not fail. Moment generating functions require exponential moments and can be infinite for heavy-tailed variables. Characteristic functions always exist because $|e^{itX}|=1$. That makes them the right language for the Central Limit Theorem, weak convergence, and heavy-tailed modelling.

Bertsekas develops real transforms / MGFs in Ch. 4.4 and uses them to identify distributions, read moments, and handle sums of independent random variables. Characteristic functions follow the same transform logic with $s$ replaced by $it$, but their interpretation is Fourier rather than exponential-growth.

The informal idea

The characteristic function of $X$ is

$$\varphi_X(t)=\mathbb{E}[e^{itX}],\qquad t\in\mathbb{R}.$$

Using Euler's identity,

$$e^{itX}=\cos(tX)+i\sin(tX),$$

so $\varphi_X$ records how the distribution of $X$ oscillates at every frequency $t$. This is why the characteristic function is a Fourier transform of the probability law.

The main computational gift is that independent sums become products:

$$\varphi_{X+Y}(t)=\varphi_X(t)\varphi_Y(t)$$

when $X$ and $Y$ are independent. Convolution in the density world becomes multiplication in the transform world.

Formal definition

For a real-valued random variable $X$, the characteristic function is

$$\varphi_X(t)=\mathbb{E}[e^{itX}],\qquad t\in\mathbb{R}.$$

If $X$ has density $f_X$, then

$$\varphi_X(t)=\int_{-\infty}^{\infty} e^{itx}f_X(x)\,dx.$$

If $X$ is discrete,

$$\varphi_X(t)=\sum_x e^{itx}p_X(x).$$

Unlike the MGF $M_X(s)=\mathbb{E}[e^{sX}]$, this expectation is always finite because $|e^{itX}|=1$.
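
The discrete formula can be evaluated directly, and the expectation form suggests a Monte Carlo estimate. The sketch below is illustrative only: the function names `cf_discrete` and `empirical_cf` and the fair-die example are assumptions, not part of the lesson.

```python
import numpy as np

def cf_discrete(t, values, probs):
    """Exact CF of a discrete law: sum over x of exp(i t x) p(x)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # One row per frequency t, one column per support point x.
    return np.exp(1j * t[:, None] * values[None, :]) @ probs

def empirical_cf(samples, t):
    """Monte Carlo estimate of the CF: sample mean of exp(i t X)."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    samples = np.asarray(samples, dtype=float)
    return np.exp(1j * t[:, None] * samples[None, :]).mean(axis=1)

# Hypothetical example: a fair six-sided die.
faces = np.arange(1, 7)
p = np.full(6, 1 / 6)
t_grid = np.linspace(-3.0, 3.0, 7)

exact = cf_discrete(t_grid, faces, p)

rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=50_000)  # upper bound is exclusive
estimate = empirical_cf(rolls, t_grid)

# Largest gap between the exact CF and its simulated estimate on the grid.
max_error = np.max(np.abs(exact - estimate))
print(max_error)
```

Both functions return complex values whose modulus cannot exceed 1, since each is an average of points on the unit circle.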

Key properties

Boundedness and normalisation

$$|\varphi_X(t)|\le1, \qquad \varphi_X(0)=1.$$

The bound is the reason the transform is always available.

Affine transformations

If $Y=aX+b$, then

$$\varphi_Y(t)=e^{itb}\varphi_X(at).$$

Shifting a distribution rotates the transform; scaling changes the frequency.
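
The affine rule follows in one line from the definition, because the constant factor leaves the expectation:

$$\varphi_Y(t)=\mathbb{E}\left[e^{it(aX+b)}\right]=e^{itb}\,\mathbb{E}\left[e^{i(at)X}\right]=e^{itb}\varphi_X(at).$$

As a check, take $X\sim\mathcal{N}(0,1)$ with $\varphi_X(t)=e^{-t^2/2}$ and set $Y=\mu+\sigma X$. The rule gives $\varphi_Y(t)=e^{i\mu t}e^{-\sigma^2t^2/2}$, which agrees with the normal characteristic function in Example 1.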

Independent sums

For independent $X$ and $Y$,

$$\varphi_{X+Y}(t)=\varphi_X(t)\varphi_Y(t).$$

This is the same multiplication principle Bertsekas develops for MGFs, now in a domain where existence is automatic.
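
A small numerical check of the product rule, as a sketch under assumed inputs: the two pmfs below are made up for illustration, and `cf_pmf` is a hypothetical helper. The pmf of the sum is built by convolution, and its CF is compared with the product of the individual CFs.

```python
import numpy as np

def cf_pmf(t, support, pmf):
    """CF of a discrete law given its support points and probabilities."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    support = np.asarray(support, dtype=float)
    pmf = np.asarray(pmf, dtype=float)
    return np.exp(1j * t[:, None] * support[None, :]) @ pmf

# Hypothetical laws: X uniform on {0, 1, 2}, Y Bernoulli(0.3) on {0, 1}.
px = np.array([1 / 3, 1 / 3, 1 / 3])
py = np.array([0.7, 0.3])

# For independent X and Y, the pmf of X + Y is the convolution of the pmfs.
p_sum = np.convolve(px, py)  # supported on {0, 1, 2, 3}

t = np.linspace(-4.0, 4.0, 41)
lhs = cf_pmf(t, np.arange(p_sum.size), p_sum)
rhs = cf_pmf(t, np.arange(px.size), px) * cf_pmf(t, np.arange(py.size), py)

# The two sides should agree up to floating-point rounding.
gap = np.max(np.abs(lhs - rhs))
print(gap)
```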

Moments from derivatives

If $\mathbb{E}[|X|^k]<\infty$, then

$$\varphi_X^{(k)}(0)=i^k\mathbb{E}[X^k].$$

The condition is important. The characteristic function can exist even when the moment does not.
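
The derivative rule can be illustrated with finite differences. This sketch uses the normal characteristic function from Example 1. The parameter values and the step size `h` are arbitrary assumptions, and central differences only approximate the derivatives.

```python
import numpy as np

def normal_cf(t, mu, sigma):
    """CF of N(mu, sigma^2): exp(i mu t - sigma^2 t^2 / 2)."""
    return np.exp(1j * mu * t - 0.5 * sigma**2 * t**2)

mu, sigma = 0.5, 2.0   # hypothetical parameters
h = 1e-4               # finite-difference step (an assumption)

# Central-difference approximations to phi'(0) and phi''(0).
first = (normal_cf(h, mu, sigma) - normal_cf(-h, mu, sigma)) / (2 * h)
second = (normal_cf(h, mu, sigma) - 2 * normal_cf(0.0, mu, sigma)
          + normal_cf(-h, mu, sigma)) / h**2

# phi^(k)(0) = i^k E[X^k], so divide by i^k to read off the moments.
mean_est = (first / 1j).real
second_moment_est = (second / 1j**2).real

print(mean_est, mu)                          # estimate vs E[X]
print(second_moment_est, sigma**2 + mu**2)   # estimate vs E[X^2]
```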

Uniqueness

The characteristic function determines the distribution. If $\varphi_X(t)=\varphi_Y(t)$ for all $t$, then $X$ and $Y$ have the same law.

Convergence

Lévy's continuity theorem says that pointwise convergence of characteristic functions, with a limit continuous at zero, implies convergence in distribution. This is the engine behind the characteristic-function proof of the CLT.

Worked examples

Example 1: normal characteristic function

For $X\sim\mathcal{N}(\mu,\sigma^2)$,

$$\varphi_X(t)=\exp\left(i\mu t-\frac12\sigma^2t^2\right).$$

The linear term carries location; the quadratic decay carries variance.
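
The closed form can be compared with a direct numerical evaluation of the density integral. The sketch below is a rough check, not a proof: the parameters, the truncation at twelve standard deviations, and the grid size are all assumptions.

```python
import numpy as np

mu, sigma = 0.1, 0.8   # hypothetical parameters

# Truncate the integral to mu +/- 12 sigma, where the density is negligible.
x = np.linspace(mu - 12 * sigma, mu + 12 * sigma, 20_001)
dx = x[1] - x[0]
density = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

t = np.array([0.0, 0.5, 1.0, 2.5])

# Riemann-sum approximation of the integral of exp(i t x) f(x) dx.
numeric = np.array([np.sum(np.exp(1j * tk * x) * density) * dx for tk in t])

# Closed-form normal CF at the same frequencies.
closed = np.exp(1j * mu * t - 0.5 * sigma**2 * t**2)

err = np.max(np.abs(numeric - closed))
print(err)
```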

Example 2: sum of independent normals

Let $X\sim\mathcal{N}(\mu_X,\sigma_X^2)$ and $Y\sim\mathcal{N}(\mu_Y,\sigma_Y^2)$ be independent. Then

$$\begin{aligned} \varphi_{X+Y}(t) &=\exp\left(i\mu_Xt-\frac12\sigma_X^2t^2\right) \exp\left(i\mu_Yt-\frac12\sigma_Y^2t^2\right)\\ &=\exp\left(i(\mu_X+\mu_Y)t-\frac12(\sigma_X^2+\sigma_Y^2)t^2\right). \end{aligned}$$

By uniqueness, $X+Y$ is normal with mean $\mu_X+\mu_Y$ and variance $\sigma_X^2+\sigma_Y^2$.

Example 3: CLT mechanism

If $X_i$ are i.i.d. with mean $0$ and variance $1$, then near zero

$$\varphi_X(u)=1-\frac12u^2+o(u^2).$$

For $S_n=(X_1+\cdots+X_n)/\sqrt{n}$, independence and the scaling rule give a product of $n$ identical factors. Substituting $u=t/\sqrt{n}$ into the expansion,

$$\varphi_{S_n}(t)=\left(\varphi_X(t/\sqrt{n})\right)^n=\left(1-\frac{t^2}{2n}+o\!\left(\frac1n\right)\right)^n \to e^{-t^2/2},$$

which is the characteristic function of $\mathcal{N}(0,1)$.
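
The convergence can be watched numerically for one concrete law. As an assumed example, take $X_i=\pm1$ with probability $1/2$ each, which has mean 0, variance 1, and characteristic function $\cos(u)$. The sketch compares $\cos(t/\sqrt{n})^n$ with $e^{-t^2/2}$ on a grid.

```python
import numpy as np

t = np.linspace(-3.0, 3.0, 61)
target = np.exp(-0.5 * t**2)   # CF of N(0, 1)

errors = {}
for n in (1, 10, 100, 1000, 10000):
    # CF of S_n for i.i.d. +/-1 steps: (cos(t / sqrt(n)))^n.
    cf_sn = np.cos(t / np.sqrt(n)) ** n
    errors[n] = float(np.max(np.abs(cf_sn - target)))

for n, e in errors.items():
    print(n, e)
```

The maximum gap on the grid should shrink as `n` grows, in line with the limit above.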

Example 4: option pricing from a log-price CF

Suppose a model gives a closed form for $\varphi_{\log S_T}(t)$ but not for the density of $S_T$. Fourier methods damp the payoff, transform it, multiply by the model CF, and invert numerically. The density never has to be written explicitly. This is the practical reason characteristic functions appear in Heston-style pricing engines.
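
The damp, transform, multiply, invert recipe can be sketched with a damped-call representation in the style of Carr and Madan. To keep the example checkable, the Black-Scholes log-price CF stands in for the model CF, so the result can be compared with the closed-form price; a Heston CF would slot into the same `cf` argument. All function names, the damping parameter `alpha`, and the integration grid are assumptions for illustration, not a production pricing engine.

```python
import math
import numpy as np

def bs_log_price_cf(u, s0, r, sigma, maturity):
    """Risk-neutral CF of log S_T under Black-Scholes; u may be complex."""
    mean = math.log(s0) + (r - 0.5 * sigma**2) * maturity
    var = sigma**2 * maturity
    return np.exp(1j * u * mean - 0.5 * var * u**2)

def call_price_from_cf(cf, strike, r, maturity,
                       alpha=1.5, v_max=200.0, n=40_000):
    """European call from a log-price CF via a damped Fourier integral."""
    k = math.log(strike)
    dv = v_max / n
    v = (np.arange(n) + 0.5) * dv   # midpoint grid on (0, v_max)
    # Transform of the damped call price exp(alpha * k) * C(k).
    denom = alpha**2 + alpha - v**2 + 1j * (2 * alpha + 1) * v
    psi = math.exp(-r * maturity) * cf(v - (alpha + 1) * 1j) / denom
    # Invert numerically, then undo the damping.
    integral = np.sum((np.exp(-1j * v * k) * psi).real) * dv
    return math.exp(-alpha * k) / math.pi * integral

def bs_call(s0, strike, r, sigma, maturity):
    """Closed-form Black-Scholes call, used only as a reference value."""
    vol = sigma * math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma**2) * maturity) / vol
    d2 = d1 - vol
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-r * maturity) * cdf(d2)

# Hypothetical contract and market parameters.
s0, strike, r, sigma, maturity = 100.0, 100.0, 0.05, 0.2, 1.0

cf = lambda u: bs_log_price_cf(u, s0, r, sigma, maturity)
fourier_price = call_price_from_cf(cf, strike, r, maturity)
closed_form_price = bs_call(s0, strike, r, sigma, maturity)
print(fourier_price, closed_form_price)
```

The pricing function only ever calls `cf`; the density of $S_T$ does not appear anywhere in it. The choice of `alpha` and the truncation `v_max` affect accuracy and would need care for models with heavier tails.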

Common confusions and pitfalls

  • "Complex-valued means non-probabilistic." The complex number is a transform coordinate. Probabilities and densities are recovered through inversion or distributional identification.
  • "The CF and MGF are interchangeable." They coincide only formally under $s=it$. The MGF may fail to exist; the CF always exists.
  • "A CF gives moments automatically." Only if the corresponding absolute moments exist. Cauchy variables have characteristic functions but no mean or variance.
  • "Multiplication works for any sum." Multiplication of transforms requires independence. Dependence requires a joint characteristic function.
  • "Knowing a few CF values is enough." The uniqueness theorem needs the full function, not isolated frequencies.

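The Cauchy pitfall can be made concrete. The standard Cauchy law has characteristic function $e^{-|t|}$, which has a kink at zero, so the derivative that would encode a mean does not exist. The sketch below compares one-sided difference quotients at the origin; the step size `h` is an arbitrary assumption.

```python
import numpy as np

def cauchy_cf(t):
    """CF of the standard Cauchy distribution: exp(-|t|)."""
    return np.exp(-np.abs(t))

h = 1e-6  # small step for the one-sided difference quotients

# Slopes approaching zero from the right and from the left.
right_slope = (cauchy_cf(h) - cauchy_cf(0.0)) / h
left_slope = (cauchy_cf(0.0) - cauchy_cf(-h)) / h

# The slopes disagree (about -1 and +1), so the CF is not
# differentiable at 0 and no mean can be read off from it.
print(right_slope, left_slope)
```
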
Where this goes next

References

  • Bertsekas, D. P., & Tsitsiklis, J. N. (2008). Introduction to Probability (2nd ed.). Athena Scientific. Ch. 4 §4.4 (Transforms). Bertsekas develops MGFs rather than characteristic functions; this note uses the same transform principles with the standard Fourier-domain extension.
