Itô's Lemma

Motivation: why this matters in quant finance

Every derivative priced in modern finance (a European call, an interest-rate swap, a credit default swap, a barrier option) is a function of a stochastic process. If the underlying stock follows $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$, then the option price $V(t, S_t)$ is a function of time and a diffusion. To compute $dV$, to construct a hedge, or to derive a pricing PDE, you need a chain rule.
The ordinary chain rule does not work. If you write down $V(t, S_t)$ and apply $dV = V_t\,dt + V_S\,dS_t$, you will get the wrong answer in a way that ruins every calculation downstream. The reason is subtle but fatal: the increments of Brownian motion scale as $\sqrt{dt}$, not $dt$, so the squared increments $(dW_t)^2$ contribute at order $dt$ and do not vanish in the limit. A second-order term that ordinary calculus discards is, in stochastic calculus, a first-order correction.
Itô's lemma is the stochastic chain rule that keeps track of this correction. Written in its most useful form, for $X_t$ satisfying $dX_t = a\,dt + b\,dW_t$ and $V = V(t, X_t)$:
$$dV = \left(V_t + a\,V_x + \tfrac{1}{2}b^2 V_{xx}\right)dt + b\,V_x\,dW_t$$
The extra term $\tfrac{1}{2}b^2 V_{xx}\,dt$, absent from ordinary calculus, is the Itô correction. Every hallmark of the Black-Scholes framework routes through it: the $-\tfrac{1}{2}\sigma^2$ in the log-return drift of geometric Brownian motion, the $\tfrac{1}{2}\sigma^2 S^2 V_{SS}$ term in the Black-Scholes PDE, the gamma-theta balance in delta hedging. Lose the correction and you lose the entire pricing theory.

The informal idea

Why does ordinary calculus fail, and what exactly does Itô fix? The argument has three steps.

Step 1 — Taylor expansion. For a smooth $f$ and small $\Delta x$:
$$f(x + \Delta x) - f(x) = f'(x)\,\Delta x + \tfrac{1}{2}f''(x)(\Delta x)^2 + O((\Delta x)^3)$$

In ordinary calculus, when $x(t)$ is smooth, the increment $\Delta x$ scales as $\Delta t$, so $(\Delta x)^2 = O((\Delta t)^2)$. When we divide by $\Delta t$ and take the limit, the second-order term vanishes. Only $f'(x)\,dx$ survives: this is the ordinary chain rule.

Step 2 — Brownian scale is different. For $X_t = W_t$, the increment $\Delta W$ is Gaussian with mean zero and standard deviation $\sqrt{\Delta t}$. So $\Delta W = O(\sqrt{\Delta t})$ in the root-mean-square sense, and:
$$(\Delta W)^2 = O(\Delta t)$$
The second-order term does not vanish: it has the same order as the deterministic term $\Delta t$. When we sum the Taylor expansion over many small intervals and take the limit, the $\tfrac{1}{2}f''(\Delta W)^2$ contribution adds up to a finite correction that behaves like an ordinary $dt$ integral.

Step 3 — Quadratic variation pins down the correction. The lesson on Brownian motion established that $\sum_i (W_{t_{i+1}} - W_{t_i})^2 \to T$ in $L^2$: the squared increments accumulate to the elapsed time. In differential notation, $(dW_t)^2 = dt$ (as a rule for manipulating differentials, not as a pointwise algebraic identity). So the Taylor correction for $f(W_t)$ becomes:
$$\tfrac{1}{2}f''(W_t)(dW_t)^2 = \tfrac{1}{2}f''(W_t)\,dt$$

This is the Itô correction. It is entirely a consequence of the $(dW)^2 = dt$ rule, which in turn follows from $\operatorname{Var}(\Delta W) = \Delta t$ together with the independence of Brownian increments.
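
A quick numerical illustration can make the $(dW)^2 = dt$ rule from Step 3 concrete. The sketch below assumes a uniform partition of $[0, T]$ and NumPy's default generator; the helper name `realized_quadratic_variation`, the seed, and the value of `T` are illustrative choices rather than anything fixed by the lesson.

```python
import numpy as np

def realized_quadratic_variation(T: float, n_steps: int, rng: np.random.Generator) -> float:
    """Sum of squared Brownian increments over a uniform partition of [0, T]."""
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)  # increments ~ N(0, dt)
    return float(np.sum(dW ** 2))

rng = np.random.default_rng(42)  # arbitrary seed
T = 2.0
for n_steps in (10, 1_000, 100_000):
    qv = realized_quadratic_variation(T, n_steps, rng)
    print(f"n = {n_steps:>7d}: sum of (dW)^2 = {qv:.4f}  (target T = {T})")
```

The sums should tighten around $T$ as the partition is refined, because the variance of the sum is $2T^2/n$.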

The canonical counterexample

Consider $Y_t = W_t^2$. The ordinary chain rule would give $dY_t = 2W_t\,dW_t$, from which $\mathbb{E}[dY_t] = 0$ (Itô integrals have mean zero), implying $\mathbb{E}[W_t^2]$ is constant. But we know $\mathbb{E}[W_t^2] = t$: it grows linearly. Ordinary calculus is off by exactly $dt$.

Itô's lemma applied to $f(x) = x^2$ (so $f'(x) = 2x$, $f''(x) = 2$) gives:

$$dY_t = 2W_t\,dW_t + \tfrac{1}{2}\cdot 2 \cdot (dW_t)^2 = 2W_t\,dW_t + dt$$

Taking expectations: $\mathbb{E}[dY_t] = 0 + dt$, so $\mathbb{E}[W_t^2] = t$. ✓

The missing $dt$ is the Itô correction. Every apparent paradox involving "$d(\text{function of } W_t)$" resolves the same way.
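
The counterexample can also be checked on a single simulated path rather than in expectation. The sketch below is a minimal illustration under assumed settings (a uniform grid, an arbitrary seed, and a left-endpoint sum standing in for the Itô integral); the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)  # arbitrary seed
T, n = 1.0, 200_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))  # W[0] = 0, W[-1] = W_T

# Left-endpoint (Itô) sum approximating the integral of 2 W dW over [0, T]
stochastic_integral = float(np.sum(2.0 * W[:-1] * dW))

actual = float(W[-1] ** 2)
naive = stochastic_integral    # ordinary chain rule: W_T^2 = integral of 2W dW
ito = stochastic_integral + T  # Itô: W_T^2 = T + integral of 2W dW

print(f"W_T^2            : {actual:.4f}")
print(f"naive chain rule : {naive:.4f}")
print(f"Itô (adds T)     : {ito:.4f}")
```

On the discrete grid the gap between `actual` and `naive` is exactly the sum of squared increments, which is why it sits close to $T$ on every path and not only on average.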

Formal definitions

Lawler presents Itô's formula first for $f(W_t)$ and then extends it to time-dependent functions and diffusions. In the notation used across this vault, let $(W_t)_{t \ge 0}$ be a standard Brownian motion on a filtered probability space, and let $(X_t)_{t \ge 0}$ be an Itô process:
$$X_t = X_0 + \int_0^t a(s, \omega)\,ds + \int_0^t b(s, \omega)\,dW_s$$

where $a$ and $b$ are adapted processes satisfying the usual integrability conditions ($\int_0^T |a_s|\,ds < \infty$ and $\int_0^T b_s^2\,ds < \infty$ almost surely). In differential notation:

$$dX_t = a_t\,dt + b_t\,dW_t$$
Itô's lemma (one-dimensional). Let $V(t, x): [0, \infty) \times \mathbb{R} \to \mathbb{R}$ be a function with continuous partial derivatives $V_t$, $V_x$, and $V_{xx}$. Then the process $Y_t := V(t, X_t)$ is itself an Itô process, and:
$$\boxed{\,dY_t = \left(V_t + a_t V_x + \tfrac{1}{2}b_t^2 V_{xx}\right)dt + b_t V_x\,dW_t\,}$$

where the partial derivatives are evaluated at $(t, X_t)$. Equivalently, in integral form:

$$V(t, X_t) = V(0, X_0) + \int_0^t \left(V_t + a_s V_x + \tfrac{1}{2}b_s^2 V_{xx}\right)ds + \int_0^t b_s V_x\,dW_s$$
where, inside the integrals, $V_t$, $V_x$, and $V_{xx}$ are evaluated at $(s, X_s)$. In the differential form, the three terms multiplying $dt$ form the drift; the $dW_t$ term is the diffusion. The Itô correction is the $\tfrac{1}{2}b^2 V_{xx}$ term in the drift.
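
To see the boxed formula as a recipe, here is a small sketch that evaluates the drift and diffusion coefficients of $V(t, X_t)$ at a single point. It estimates the partial derivatives by central finite differences; the function name `ito_coefficients`, the step size `h`, and the sample parameter values are assumptions made for illustration.

```python
import math

def ito_coefficients(V, t, x, a, b, h=1e-3):
    """Approximate (drift, diffusion) of Y = V(t, X) at the point (t, x).

    V is a callable V(t, x); a and b are the drift and diffusion of X at that point.
    Partial derivatives are estimated by central finite differences with step h.
    """
    V_t = (V(t + h, x) - V(t - h, x)) / (2 * h)
    V_x = (V(t, x + h) - V(t, x - h)) / (2 * h)
    V_xx = (V(t, x + h) - 2 * V(t, x) + V(t, x - h)) / h**2
    drift = V_t + a * V_x + 0.5 * b**2 * V_xx  # includes the Itô correction
    diffusion = b * V_x
    return drift, diffusion

# V(t, x) = x^2 with X = W (a = 0, b = 1): expect drift close to 1, diffusion close to 2x
print(ito_coefficients(lambda t, x: x**2, t=0.5, x=1.5, a=0.0, b=1.0))

# V(t, s) = ln s with a = mu*s, b = sigma*s: expect drift close to mu - sigma^2/2
mu, sigma, s0 = 0.08, 0.25, 100.0  # illustrative values
print(ito_coefficients(lambda t, s: math.log(s), t=0.5, x=s0, a=mu * s0, b=sigma * s0))
```

Finite differences are used only to keep the sketch dependency-free; a symbolic or automatic-differentiation approach would serve equally well.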

Differential rules used in practice

Itô's lemma is usually applied mechanically via the following multiplication rules, which encode $(dW_t)^2 = dt$ and the vanishing of all other second-order products:

| $\times$ | $dt$ | $dW_t$ |
|---|---|---|
| $dt$ | $0$ | $0$ |
| $dW_t$ | $0$ | $dt$ |

To apply Itô's lemma, write out the second-order Taylor expansion, substitute $dX_t = a\,dt + b\,dW_t$, square it, and drop all terms that vanish by the table. What remains is the Itô formula.
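
Written out once in full, the mechanical recipe looks as follows. This is only the bookkeeping implied by the multiplication rules, using the same symbols $a$, $b$, and $V$ as above; the mixed term $V_{tx}\,dt\,dX_t$ and the term $V_{tt}\,(dt)^2$ are omitted because every product they contain is zero under those rules.

```latex
\begin{aligned}
(dX_t)^2 &= (a\,dt + b\,dW_t)^2 \\
         &= a^2\,(dt)^2 + 2ab\,dt\,dW_t + b^2\,(dW_t)^2 \\
         &= 0 + 0 + b^2\,dt, \\[4pt]
dV &= V_t\,dt + V_x\,dX_t + \tfrac{1}{2}V_{xx}\,(dX_t)^2 \\
   &= V_t\,dt + V_x\,(a\,dt + b\,dW_t) + \tfrac{1}{2}b^2 V_{xx}\,dt \\
   &= \left(V_t + a\,V_x + \tfrac{1}{2}b^2 V_{xx}\right)dt + b\,V_x\,dW_t .
\end{aligned}
```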

Sketch of the proof

Partition $[0, t]$ into $0 = t_0 < t_1 < \cdots < t_n = t$ with mesh $\delta = \max_i(t_{i+1} - t_i)$. Write:

$$V(t, X_t) - V(0, X_0) = \sum_{i=0}^{n-1} \big[V(t_{i+1}, X_{t_{i+1}}) - V(t_i, X_{t_i})\big]$$

Apply a two-dimensional Taylor expansion to each summand:

$$V(t_{i+1}, X_{t_{i+1}}) - V(t_i, X_{t_i}) = V_t\,\Delta t_i + V_x\,\Delta X_i + \tfrac{1}{2}V_{xx}(\Delta X_i)^2 + V_{tx}\,\Delta t_i\,\Delta X_i + \tfrac{1}{2}V_{tt}(\Delta t_i)^2 + \text{higher order}$$

with partial derivatives evaluated at $(t_i, X_{t_i})$. Summing and taking $\delta \to 0$:

  • $\sum V_t\,\Delta t_i \to \int_0^t V_t\,ds$ (standard Riemann integral).
  • $\sum V_x\,\Delta X_i \to \int_0^t V_x\,dX_s = \int_0^t a V_x\,ds + \int_0^t b V_x\,dW_s$ (Itô integral).
  • $\sum V_{xx}(\Delta X_i)^2 \to \int_0^t V_{xx} b^2\,ds$ (by quadratic variation $(dW)^2 = dt$; the cross-term $\Delta t_i\,\Delta X_i$ and the pure $(\Delta t_i)^2$ vanish).
  • Higher-order terms vanish.

Combining these limits, with the factor $\tfrac{1}{2}$ from the Taylor expansion carried on the $V_{xx}$ term, gives the Itô formula. A full proof replaces each heuristic step with an $L^2$ convergence argument; the structure above is the mnemonic every practitioner uses.
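
The key step in the sketch, replacing $(\Delta X_i)^2$ by $b^2\,\Delta t_i$ inside the second-order sum, can be checked numerically on one path. The example below is a hedged illustration only: it assumes $X = W$ (so $a = 0$, $b = 1$), picks $V(x) = \sin x$ because its derivatives are bounded, and uses an arbitrary seed and grid size.

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed
t_end, n = 1.0, 100_000
dt = t_end / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))
W_left = W[:-1]  # evaluation points W_{t_i} (left endpoints)

# V(x) = sin(x): V_x = cos(x), V_xx = -sin(x)
first_order = float(np.sum(np.cos(W_left) * dW))
second_order_dW2 = float(0.5 * np.sum(-np.sin(W_left) * dW**2))  # raw Taylor term
second_order_dt = float(0.5 * np.sum(-np.sin(W_left) * dt))      # after (dW)^2 -> dt

total_change = float(np.sin(W[-1]) - np.sin(W[0]))

print(f"V(W_t) - V(W_0)               : {total_change:.5f}")
print(f"first order only              : {first_order:.5f}")
print(f"first + second, using (dW)^2  : {first_order + second_order_dW2:.5f}")
print(f"first + second, using dt      : {first_order + second_order_dt:.5f}")
```

The two second-order sums should agree closely, which is the numerical face of the quadratic-variation limit used in the third bullet.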

Key properties

It is a second-order chain rule

The ordinary chain rule keeps only the first derivative because smooth increments satisfy $(dx)^2 = o(dt)$. Brownian-driven increments satisfy $(dW_t)^2 = dt$ in the quadratic-variation sense, so the second derivative contributes to drift.

It separates drift from innovation

After applying Itô's lemma, every transformed process has a $dt$ part and a $dW_t$ part. In finance, the $dW_t$ coefficient is the hedgeable shock exposure, while the $dt$ coefficient is the drift to be eliminated, priced, or interpreted.

Convexity controls the sign of the correction

The correction term

$$\frac{1}{2}b_t^2 V_{xx}\,dt$$

has the sign of $V_{xx}$. Convex functions receive positive drift from volatility; concave functions receive negative drift. This is the calculus behind gamma exposure and volatility drag.
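
The sign rule is easy to observe by Monte Carlo. The sketch below samples $W_T \sim N(0, T)$ directly and compares $\mathbb{E}[f(W_T)]$ with $f(0)$ for one convex, one linear, and one concave function; the three test functions, the seed, and the sample size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(11)  # arbitrary seed
T, N = 1.0, 200_000
W_T = rng.normal(0.0, np.sqrt(T), size=N)  # W_T ~ N(0, T)

functions = {
    "convex:  exp(x)": np.exp,
    "linear:  x": lambda x: x,
    "concave: -log(1 + exp(x))": lambda x: -np.logaddexp(0.0, x),
}
for name, f in functions.items():
    start = float(f(0.0))
    mean_end = float(np.mean(f(W_T)))
    print(f"{name:<28s} f(0) = {start:+.4f}   E[f(W_T)] ~ {mean_end:+.4f}")
```

With zero drift in $W$, any movement of the mean away from $f(0)$ comes from the $\tfrac{1}{2}f''$ term alone: upward for the convex function, none for the linear one, downward for the concave one.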

Time-dependent functions add a separate theta term

For $V(t, X_t)$, the $V_t\,dt$ term carries no $dW_t$ component: it is pure drift. In option pricing this is theta, the time-decay component that must balance gamma and financing terms in the Black-Scholes PDE.

Worked examples

Example 1 — The log of geometric Brownian motion

This is the single most-referenced application of Itô's lemma. Let $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$ (so $a = \mu S_t$, $b = \sigma S_t$), and set $V(t, S) = \ln S$. Then $V_t = 0$, $V_S = 1/S$, $V_{SS} = -1/S^2$. Itô's lemma gives:

$$d(\ln S_t) = \left(0 + \mu S_t \cdot \tfrac{1}{S_t} + \tfrac{1}{2}\sigma^2 S_t^2 \cdot \left(-\tfrac{1}{S_t^2}\right)\right)dt + \sigma S_t \cdot \tfrac{1}{S_t}\,dW_t = \left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW_t$$
Integrate and exponentiate to get the closed-form GBM solution $S_t = S_0\exp\left((\mu - \tfrac{1}{2}\sigma^2)t + \sigma W_t\right)$. The $-\tfrac{1}{2}\sigma^2$ is entirely from the Itô correction $\tfrac{1}{2}b^2 V_{xx} = \tfrac{1}{2}(\sigma S)^2 (-1/S^2) = -\tfrac{1}{2}\sigma^2$.
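
The log-drift can be checked without using the closed-form solution at all, by stepping the SDE for $S_t$ directly. The sketch below uses an Euler-Maruyama discretisation, which is an assumed numerical scheme rather than anything prescribed by the lesson; the parameter values, step count, path count, and seed are likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)  # arbitrary seed
mu, sigma, S0 = 0.08, 0.25, 100.0  # illustrative parameters
T, n_steps, n_paths = 1.0, 250, 50_000
dt = T / n_steps

# Euler-Maruyama on the SDE itself: S_{i+1} = S_i * (1 + mu*dt + sigma*dW_i)
S = np.full(n_paths, S0)
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    S = S * (1.0 + mu * dt + sigma * dW)

mean_log_return = float(np.mean(np.log(S / S0)))
naive_drift = mu * T                      # what d(ln S) = dS/S would predict
ito_drift = (mu - 0.5 * sigma**2) * T     # Itô's lemma

print(f"mean of ln(S_T / S_0): {mean_log_return:.4f}")
print(f"naive prediction     : {naive_drift:.4f}")
print(f"Itô prediction       : {ito_drift:.4f}")
```

The simulation never mentions $-\tfrac{1}{2}\sigma^2$, yet the average log-return should settle near the Itô prediction rather than near $\mu T$.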

Example 2 — $W_t^2$ and $W_t^3$

Setting $V(W) = W^2$: $V' = 2W$, $V'' = 2$. With $dX_t = dW_t$ (so $a = 0$, $b = 1$):

$$d(W_t^2) = \left(0 + 0 + \tfrac{1}{2}\cdot 1 \cdot 2\right)dt + 1\cdot 2W_t\,dW_t = dt + 2W_t\,dW_t$$

Taking expectations: $\mathbb{E}[W_t^2] = 0 + \int_0^t 1\,ds = t$, confirming the known variance.

Setting $V(W) = W^3$: $V' = 3W^2$, $V'' = 6W$. Same $a = 0$, $b = 1$:

$$d(W_t^3) = \tfrac{1}{2}\cdot 6W_t\,dt + 3W_t^2\,dW_t = 3W_t\,dt + 3W_t^2\,dW_t$$

So $\mathbb{E}[W_t^3] = \int_0^t 3\,\mathbb{E}[W_s]\,ds = 0$. (The mean-zero, symmetric structure of Brownian motion shows up as an automatic cancellation.)
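
The same recipe extends one power further and recovers the Gaussian fourth moment. The derivation below is an additional worked case, not one taken from the source, and it assumes the stochastic integral has mean zero (which holds here because $W_t^3$ is square-integrable).

```latex
\begin{aligned}
V(W) &= W^4, \qquad V' = 4W^3, \qquad V'' = 12W^2, \\
d(W_t^4) &= \tfrac{1}{2}\cdot 12\,W_t^2\,dt + 4W_t^3\,dW_t
          = 6W_t^2\,dt + 4W_t^3\,dW_t, \\
\mathbb{E}[W_t^4] &= \int_0^t 6\,\mathbb{E}[W_s^2]\,ds
                   = \int_0^t 6s\,ds = 3t^2 .
\end{aligned}
```

This matches the fourth moment of a $N(0, t)$ random variable, and it shows the pattern: each even moment is obtained by feeding the previous one into the drift term.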

Example 3 — Exponential martingale and the Doléans-Dade exponential

Let $V(t, W) = \exp(\sigma W - \tfrac{1}{2}\sigma^2 t)$. Then $V_t = -\tfrac{1}{2}\sigma^2 V$, $V_W = \sigma V$, $V_{WW} = \sigma^2 V$. With $dX_t = dW_t$:

$$dV = \left(-\tfrac{1}{2}\sigma^2 V + 0 + \tfrac{1}{2}\sigma^2 V\right)dt + \sigma V\,dW_t = \sigma V\,dW_t$$
The drift cancels exactly. The process $M_t := V(t, W_t)$ has no drift, so $M_t$ is a martingale. This is the exponential martingale, the Doléans-Dade exponential of $\sigma W_t$, and the foundation of Girsanov's theorem. The cancellation is the Itô correction: the $+\tfrac{1}{2}\sigma^2$ from $\tfrac{1}{2}b^2 V_{xx}$ precisely offsets the $-\tfrac{1}{2}\sigma^2$ in the partial derivative $V_t$.
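
A necessary consequence of the martingale property is that the mean stays at its starting value of 1 for every $t$. The sketch below checks this by sampling $W_t \sim N(0, t)$ directly; it tests only the constant-mean consequence, not the full conditional-expectation property, and the helper name `exponential_means`, the seed, and the parameter values are assumptions.

```python
import numpy as np

def exponential_means(sigma: float, t: float, n_samples: int, rng: np.random.Generator):
    """Monte Carlo means of exp(sigma*W_t - sigma^2 t / 2) and of exp(sigma*W_t)."""
    W_t = rng.normal(0.0, np.sqrt(t), size=n_samples)
    compensated = float(np.mean(np.exp(sigma * W_t - 0.5 * sigma**2 * t)))
    uncompensated = float(np.mean(np.exp(sigma * W_t)))
    return compensated, uncompensated

rng = np.random.default_rng(21)  # arbitrary seed
sigma = 0.5
for t in (0.5, 1.0, 2.0):
    comp, uncomp = exponential_means(sigma, t, 200_000, rng)
    print(f"t = {t}: with -sigma^2 t/2 -> {comp:.4f}   without -> {uncomp:.4f}"
          f"   (exp(sigma^2 t/2) = {np.exp(0.5 * sigma**2 * t):.4f})")
```

Without the $-\tfrac{1}{2}\sigma^2 t$ term the mean drifts upward like $e^{\sigma^2 t/2}$; with it, the mean should stay near 1.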

Example 4 — Derivation of the Black-Scholes PDE

This is the payoff. Let the stock follow $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$, and let $V(t, S_t)$ be the value of a derivative. Applying Itô:

$$dV = \left(V_t + \mu S V_S + \tfrac{1}{2}\sigma^2 S^2 V_{SS}\right)dt + \sigma S V_S\,dW_t$$

Now construct a self-financing hedge portfolio $\Pi = V - \Delta\cdot S$ where $\Delta = V_S$. The stochastic terms cancel:

$$d\Pi = dV - V_S\,dS = \left(V_t + \tfrac{1}{2}\sigma^2 S^2 V_{SS}\right)dt$$

$\Pi$ is instantaneously riskless, so by no-arbitrage it must earn the risk-free rate $r$:

$$d\Pi = r\Pi\,dt = r(V - S V_S)\,dt$$
Equating the two expressions for $d\Pi$ yields the Black-Scholes PDE:
$$V_t + rSV_S + \tfrac{1}{2}\sigma^2 S^2 V_{SS} - rV = 0$$

The $\tfrac{1}{2}\sigma^2 S^2 V_{SS}$ term (gamma times variance) is the Itô correction. Without it there is no PDE, no closed-form option price, and no delta hedging. The entire derivative-pricing edifice rests on this single $\tfrac{1}{2}b^2 V_{xx}$.
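
One way to sanity-check the PDE is to plug in a known solution and measure the residual. The sketch below uses the standard Black-Scholes European call formula, which this lesson does not derive and which is brought in here as an outside assumption, and estimates the partial derivatives by central finite differences. The function names, strike, rate, volatility, and step sizes are all illustrative.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(t, S, K=100.0, r=0.03, sigma=0.2, T=1.0):
    """Standard Black-Scholes call price at time t < T (assumed, not derived here)."""
    tau = T - t
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def pde_residual(t, S, r=0.03, sigma=0.2, h_t=1e-4, h_S=1e-2, include_ito_term=True):
    """Left-hand side of the Black-Scholes PDE evaluated on bs_call."""
    V = bs_call(t, S)
    V_t = (bs_call(t + h_t, S) - bs_call(t - h_t, S)) / (2 * h_t)
    V_S = (bs_call(t, S + h_S) - bs_call(t, S - h_S)) / (2 * h_S)
    V_SS = (bs_call(t, S + h_S) - 2 * V + bs_call(t, S - h_S)) / h_S**2
    ito_term = 0.5 * sigma**2 * S**2 * V_SS if include_ito_term else 0.0
    return V_t + r * S * V_S + ito_term - r * V

print(f"residual with Itô term   : {pde_residual(0.0, 100.0):.2e}")
print(f"residual without Itô term: {pde_residual(0.0, 100.0, include_ito_term=False):.4f}")
```

The residual should be numerically negligible when the gamma term is included and clearly nonzero when it is dropped, since theta and the financing terms alone do not balance.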

Example 5 — Short numerical check

The Itô correction is not an abstraction — you can see it on your laptop in ten seconds.

```python
# Python: d(W^2) = dt + 2W dW implies E[W_T^2] = T, not 0
import numpy as np

rng = np.random.default_rng(0)
T, n, N = 1.0, 1000, 100_000
dt = T / n
dW = rng.normal(0, np.sqrt(dt), size=(N, n))
W = np.cumsum(dW, axis=1)

# Naive "ordinary chain rule": d(W^2) = 2W dW, so W_T^2 = ∫2W dW has mean 0
naive_prediction = 0.0
# Itô: d(W^2) = dt + 2W dW, so W_T^2 = T + ∫2W dW has mean T
ito_prediction = T

empirical = (W[:, -1] ** 2).mean()
print(f"Ordinary-calculus prediction: {naive_prediction}")
print(f"Itô prediction: {ito_prediction}")
print(f"Empirical mean: {empirical:.4f}")
# Ordinary-calculus prediction: 0.0
# Itô prediction: 1.0
# Empirical mean: 1.0017
```

The empirical mean lands on the Itô prediction, not on zero. The Itô correction is the difference between a pricing model that works and one that doesn't.

Itô's lemma for multiple processes

In quant finance, many models have two or more Brownian drivers — stochastic volatility (Heston), multi-factor rates (two-factor HJM), multi-asset baskets. The multidimensional Itô formula generalises cleanly.

Let $X_t = (X_t^1, \ldots, X_t^n)$ be an $n$-dimensional Itô process driven by an $m$-dimensional Brownian motion $W_t = (W_t^1, \ldots, W_t^m)$ with independent components:

$$dX_t^i = a_t^i\,dt + \sum_{k=1}^m b_t^{ik}\,dW_t^k$$

For a $V(t, x_1, \ldots, x_n)$ that is continuously differentiable in $t$ and twice continuously differentiable in $x$:

$$dV = V_t\,dt + \sum_i V_{x_i}\,dX_t^i + \tfrac{1}{2}\sum_{i, j}V_{x_i x_j}\,d[X^i, X^j]_t$$
where the quadratic covariation is:
$$d[X^i, X^j]_t = \sum_{k=1}^m b_t^{ik}b_t^{jk}\,dt$$

When the driving Brownians are correlated with $d[W^k, W^\ell]_t = \rho_{k\ell}\,dt$, the covariation formula generalises to $d[X^i, X^j]_t = \sum_{k, \ell} b_t^{ik}b_t^{j\ell}\rho_{k\ell}\,dt$. This is how the Heston model, the SABR model, and the Margrabe exchange-option formula are derived.
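
The correlated rule $d[W^1, W^2]_t = \rho\,dt$ and the resulting product rule can both be seen on a simulated path. The sketch below builds two correlated Brownian motions from independent increments (a standard two-factor construction, assumed here) and takes $V(x, y) = xy$ as the test function; the correlation value, grid size, and seed are illustrative.

```python
import numpy as np

rng = np.random.default_rng(13)  # arbitrary seed
rho, T, n = 0.6, 1.0, 100_000
dt = T / n

# Build correlated increments from two independent sets of increments
dZ1 = rng.normal(0.0, np.sqrt(dt), size=n)
dZ2 = rng.normal(0.0, np.sqrt(dt), size=n)
dW1 = dZ1
dW2 = rho * dZ1 + np.sqrt(1.0 - rho**2) * dZ2

# Realized covariation: should approximate [W^1, W^2]_T = rho * T
realized_covariation = float(np.sum(dW1 * dW2))

# Two-dimensional Itô with V(x, y) = x*y gives the product rule
# d(W^1 W^2) = W^1 dW^2 + W^2 dW^1 + d[W^1, W^2]
W1 = np.concatenate(([0.0], np.cumsum(dW1)))
W2 = np.concatenate(([0.0], np.cumsum(dW2)))
integral_terms = float(np.sum(W1[:-1] * dW2 + W2[:-1] * dW1))
product_at_T = float(W1[-1] * W2[-1])

print(f"sum of dW1*dW2                 : {realized_covariation:.4f}  (rho*T = {rho * T})")
print(f"W1_T * W2_T                    : {product_at_T:.4f}")
print(f"integral terms + covariation   : {integral_terms + realized_covariation:.4f}")
```

On the discrete grid the product identity holds exactly, so the last two printed numbers agree up to floating-point error; the covariation term is what the ordinary product rule would miss.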

Common confusions and pitfalls

"Itô's lemma is just the chain rule with an extra term." It is and it isn't. The form looks like a chain rule, but the content is different: ordinary calculus treats $(dx)^2$ as negligible, while Itô calculus treats $(dW)^2$ as a first-order contribution. Forgetting this distinction (say, computing $d(\ln S_t) = dS_t/S_t$ without the $-\tfrac{1}{2}\sigma^2\,dt$) is one of the most common mistakes in stochastic calculus, and the resulting model's log-drift will be off by $\tfrac{1}{2}\sigma^2$ per year (about 3.1% per year at 25% volatility).
"$(dW_t)^2 = dt$ is an algebraic identity." It is not. You are not squaring a random number and getting a deterministic one. The identity is shorthand for the $L^2$ limit $\sum_i(\Delta W_i)^2 \to T$ under refinement. Inside an Itô calculation, treating $(dW_t)^2$ as $dt$ is a valid manipulation because what you actually compute is an integral against $dt$, not a pointwise value. Outside that context the "identity" is meaningless.
"Itô's lemma applies to any continuous process." No. It requires an Itô process: an adapted integrand structure $dX = a\,dt + b\,dW$ with well-defined Itô integrals. For processes with jumps (Lévy, jump-diffusion), a generalised version with a compensator term is needed. For continuous processes whose quadratic variation is zero or infinite rather than finite and nonzero (fractional Brownian motion with $H \ne 1/2$, the setting of rough-path theory), Itô's lemma in this form fails and specialised calculi apply.
"The Itô correction sign is always negative." The correction is $+\tfrac{1}{2}b^2 V_{xx}$, which has the sign of $V_{xx}$. For $V = \ln S$, $V_{SS} = -1/S^2 < 0$, giving a negative correction $-\tfrac{1}{2}\sigma^2$. For $V = S^2$, $V_{SS} = 2 > 0$, giving a positive correction. The sign of the correction is the sign of the convexity of the payoff, which is why convex payoffs (options) benefit from volatility and concave payoffs (short options) suffer.
"Itô's lemma gives the expectation of $V(t, X_t)$." It does not. It gives the SDE satisfied by $V(t, X_t)$. Taking expectations and using that Itô integrals have mean zero (under suitable integrability) gives the ODE $\tfrac{d}{dt}\mathbb{E}[V] = \mathbb{E}[V_t + a V_x + \tfrac{1}{2}b^2 V_{xx}]$, but solving this ODE still requires knowing enough about the distribution of $X_t$, usually via the Feynman-Kac formula or a direct distributional argument.
"Stratonovich calculus is a competing framework." Stratonovich integration, which uses a midpoint convention, does satisfy the ordinary chain rule. The catch for finance is that self-financing trading gains are naturally left-endpoint, non-anticipative objects, so Itô is the default pricing convention. Knowing the difference matters when reading non-finance stochastic-calculus references.
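
The difference between the two conventions shows up already for $\int_0^T W\,dW$. The sketch below compares a left-endpoint (Itô) sum with an endpoint-average sum, which is one common discretisation of the Stratonovich integral and is assumed here for illustration; the seed and grid size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(17)  # arbitrary seed
T, n = 1.0, 100_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))
W_T = float(W[-1])

# Itô: integrand evaluated at the left endpoint (non-anticipative)
ito_sum = float(np.sum(W[:-1] * dW))
# Stratonovich-style: integrand averaged over both endpoints of each interval
stratonovich_sum = float(np.sum(0.5 * (W[:-1] + W[1:]) * dW))

print(f"Itô sum          : {ito_sum:.4f}   vs (W_T^2 - T)/2 = {0.5 * (W_T**2 - T):.4f}")
print(f"Stratonovich sum : {stratonovich_sum:.4f}   vs W_T^2 / 2     = {0.5 * W_T**2:.4f}")
print(f"difference       : {stratonovich_sum - ito_sum:.4f}   vs T/2 = {0.5 * T}")
```

The endpoint-average sum telescopes to $W_T^2/2$, the answer ordinary calculus would give, while the left-endpoint sum falls short by half the quadratic variation, which is the Itô correction again.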

Where this goes next

  • Geometric Brownian Motion: The closed-form GBM solution is pure Itô: the $-\tfrac{1}{2}\sigma^2$ is the Itô correction applied to $\ln S$.
  • Stochastic Differential Equations: The integral-equation framework in which Itô's lemma is formally defined. Also supplies the existence and uniqueness theorems needed to apply it.
  • Black-Scholes PDE: The capstone application. Itô's lemma applied to $V(t, S_t)$ plus a self-financing hedge yields the pricing PDE.
  • Girsanov's Theorem: Uses the exponential martingale from Example 3 to change Brownian drift under an absolutely continuous measure change.
  • Infinitesimal Generators and Kolmogorov Equations: Packages the Itô drift operator into the PDE machinery behind Feynman-Kac.
  • Jump-Diffusion Processes: Generalises Itô's lemma to include jumps (Merton's jump-diffusion). The extra term is a compensator integral against the Poisson random measure.

References

  • Lawler, G. F. (2023). Stochastic Calculus: An Introduction with Applications. Ch. 3 §3.3 (Itô's formula), §3.4 (More versions of Itô's formula), especially Examples 3.3.1-3.3.2 and 3.4.1.
  • Albin, P., Hamza, K., & Klebaner, F. C. (2025). Problems and Solutions in Stochastic Calculus with Applications. World Scientific. Ch. 4 (Brownian Motion Calculus) — supporting exercise checks.
