Every derivative priced in modern finance (a European call, an interest-rate swap, a credit default swap, a barrier option) is a function of a stochastic process. If the underlying stock follows $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$, then the option price $V(t, S_t)$ is a function of time and a diffusion. To compute $dV$, to construct a hedge, or to derive a pricing PDE, you need a chain rule.
The ordinary chain rule does not work. If you write down $V(t, S_t)$ and apply $dV = V_t\,dt + V_S\,dS_t$, you will get the wrong answer in a way that ruins every calculation downstream. The reason is subtle but fatal: the increments of Brownian motion scale as $\sqrt{dt}$, not $dt$, so the squared increments $(dW_t)^2$ contribute at order $dt$ and do not vanish in the limit. A second-order term that ordinary calculus discards is, in stochastic calculus, a first-order correction.
Itô's lemma is the stochastic chain rule that keeps track of this correction. Written in its most useful form, for $X_t$ satisfying $dX_t = a\,dt + b\,dW_t$ and $V = V(t, X_t)$:
$$dV = \left(V_t + a V_x + \tfrac12 b^2 V_{xx}\right)dt + b V_x\,dW_t$$
The extra term $\tfrac12 b^2 V_{xx}\,dt$, absent from ordinary calculus, is the Itô correction. Every hallmark of the Black-Scholes framework routes through it: the $-\tfrac12\sigma^2$ in the log-return drift of geometric Brownian motion, the $\tfrac12\sigma^2 S^2 V_{SS}$ term in the Black-Scholes PDE, the gamma-theta balance in delta hedging. Lose the correction and you lose the entire pricing theory.
The informal idea
Why does ordinary calculus fail, and what exactly does Itô fix? The argument has three steps.
Step 1: Taylor expansion. For a smooth $f$ and small $\Delta x$:
$$f(x+\Delta x) - f(x) = f'(x)\,\Delta x + \tfrac12 f''(x)\,(\Delta x)^2 + O\!\left((\Delta x)^3\right)$$
In ordinary calculus, when $x(t)$ is smooth, the increment $\Delta x$ scales as $\Delta t$, so $(\Delta x)^2 = O((\Delta t)^2)$. When we divide by $\Delta t$ and take the limit, the second-order term vanishes. Only $f'(x)\,dx$ survives: this is the ordinary chain rule.
Step 2: Brownian scale is different. For $X_t = W_t$, the increment $\Delta W$ is Gaussian with mean zero and standard deviation $\sqrt{\Delta t}$. So $\Delta W = O(\sqrt{\Delta t})$, and:
$$(\Delta W)^2 = O(\Delta t)$$
The second-order term does not vanish: it has the same order as the deterministic term $\Delta t$. When we sum the Taylor expansion over many small intervals and take the limit, the $\tfrac12 f''\,(\Delta W)^2$ contribution adds up to a finite $dt$-integral instead of disappearing.
Step 3: Quadratic variation pins down the correction. The lesson on Brownian motion established that $\sum_i (W_{t_{i+1}} - W_{t_i})^2 \to T$ in $L^2$: the squared increments accumulate to the elapsed time. In differential notation, $(dW_t)^2 = dt$ (as a rule for manipulating differentials, not as a pointwise algebraic identity). So the Taylor correction for $f(W_t)$ becomes:
$$\tfrac12 f''(W_t)\,(dW_t)^2 = \tfrac12 f''(W_t)\,dt$$
This is the Itô correction. It is entirely a consequence of the $(dW)^2 = dt$ rule, which in turn is a consequence of $\mathrm{Var}(\Delta W) = \Delta t$.
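The quadratic-variation claim in Step 3 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `realized_quadratic_variation`, the seed, and the grid sizes are illustrative choices, not part of the lesson. It sums squared Brownian increments over finer and finer uniform grids on $[0, T]$, and the printed values should settle near $T$.

```python
import numpy as np


def realized_quadratic_variation(T=1.0, n_steps=100_000, seed=0):
    """Sum of squared Brownian increments over [0, T] on a uniform grid."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Each increment is N(0, dt), i.e. standard deviation sqrt(dt).
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    return float(np.sum(dW ** 2))


for n in (100, 10_000, 1_000_000):
    qv = realized_quadratic_variation(T=1.0, n_steps=n)
    print(f"n = {n:>9,d}   sum of (dW)^2 = {qv:.4f}")
```

For a uniform grid the sum has mean $T$ and standard deviation $T\sqrt{2/n}$, so the scatter around $T$ shrinks as the grid is refined.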
The canonical counterexample
Consider $Y_t = W_t^2$. The ordinary chain rule would give $dY_t = 2W_t\,dW_t$, from which $E[dY_t] = 0$ (Itô integrals have mean zero), implying $E[W_t^2]$ is constant. But we know $E[W_t^2] = t$: it grows linearly. Ordinary calculus is off by exactly $dt$.
Itô's lemma applied to $f(x) = x^2$ (so $f'(x) = 2x$, $f''(x) = 2$) gives:
$$dY_t = 2W_t\,dW_t + \tfrac12 \cdot 2 \cdot (dW_t)^2 = 2W_t\,dW_t + dt$$
Taking expectations: $E[dY_t] = 0 + dt$, so $E[W_t^2] = t$. ✓
The missing $dt$ is the Itô correction. Every apparent paradox involving "$d(\text{function of } W_t)$" resolves the same way.
Formal definitions
Lawler presents Itô's formula first for $f(W_t)$ and then extends it to time-dependent functions and diffusions. In the notation used across this vault, let $(W_t)_{t \ge 0}$ be a standard Brownian motion on a filtered probability space, and let $(X_t)_{t \ge 0}$ be an Itô process:
$$X_t = X_0 + \int_0^t a(s,\omega)\,ds + \int_0^t b(s,\omega)\,dW_s$$
where $a$ and $b$ are adapted processes satisfying the usual integrability conditions ($\int_0^T |a_s|\,ds < \infty$ and $\int_0^T b_s^2\,ds < \infty$ almost surely). In differential notation:
$$dX_t = a_t\,dt + b_t\,dW_t$$
Itô's lemma (one-dimensional). Let $V(t,x) : [0,\infty) \times \mathbb{R} \to \mathbb{R}$ be a function with continuous partial derivatives $V_t$, $V_x$, and $V_{xx}$. Then the process $Y_t := V(t, X_t)$ is itself an Itô process, and:
$$dY_t = \left(V_t + a_t V_x + \tfrac12 b_t^2 V_{xx}\right)dt + b_t V_x\,dW_t$$
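where the partial derivatives are evaluated at $(t, X_t)$. Equivalently, in integral form (with the partial derivatives inside the integrals evaluated at $(s, X_s)$):
$$Y_t = Y_0 + \int_0^t \left(V_t + a_s V_x + \tfrac12 b_s^2 V_{xx}\right)ds + \int_0^t b_s V_x\,dW_s$$
The three terms multiplying $dt$ form the drift; the $dW_t$ term is the diffusion. The Itô correction is the $\tfrac12 b^2 V_{xx}$ term in the drift.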
Differential rules used in practice
Itô's lemma is usually applied mechanically via the following multiplication rules, which encode $(dW_t)^2 = dt$ and the vanishing of all other second-order products:
$$
\begin{array}{c|cc}
\cdot & dt & dW_t \\ \hline
dt & 0 & 0 \\
dW_t & 0 & dt
\end{array}
$$
To apply Itô's lemma, write out the second-order Taylor expansion, substitute $dX_t = a\,dt + b\,dW_t$, square it, and drop all terms that vanish by the table. What remains is the Itô formula.
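The mechanical recipe above can be written out as a few lines of code. This is a minimal sketch rather than a general symbolic tool: the helper names `multiply_differentials` and `ito_differential` are hypothetical, differentials are represented as `(dt coefficient, dW coefficient)` pairs, and the partial derivatives of $V$ are supplied by hand at a single point. The numerical inputs are illustrative.

```python
def multiply_differentials(d1, d2):
    """Multiply (a1 dt + b1 dW) by (a2 dt + b2 dW) using the table above.

    Differentials are (dt_coefficient, dW_coefficient) pairs. Only the
    dW * dW = dt entry survives, so the product is (b1 * b2) dt + 0 dW.
    """
    _, b1 = d1
    _, b2 = d2
    return (b1 * b2, 0.0)


def ito_differential(v_t, v_x, v_xx, a, b):
    """Return (drift, diffusion) of dV at one point, given partials of V."""
    dx = (a, b)                                  # dX = a dt + b dW
    dx_squared = multiply_differentials(dx, dx)  # (dX)^2 = b^2 dt
    # Second-order Taylor expansion: V_t dt + V_x dX + (1/2) V_xx (dX)^2
    drift = v_t + v_x * dx[0] + 0.5 * v_xx * dx_squared[0]
    diffusion = v_x * dx[1] + 0.5 * v_xx * dx_squared[1]
    return drift, diffusion


# Illustrative inputs: V = ln(S) with dS = mu*S dt + sigma*S dW at S = 100.
s, mu, sigma = 100.0, 0.05, 0.20
drift, diffusion = ito_differential(
    v_t=0.0, v_x=1.0 / s, v_xx=-1.0 / s**2, a=mu * s, b=sigma * s
)
print(f"drift = {drift:.4f}, diffusion = {diffusion:.4f}")
```

With these inputs the drift equals $\mu - \tfrac12\sigma^2$ and the diffusion equals $\sigma$, up to floating-point rounding.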
Sketch of the proof
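Partition $[0,t]$ into $0 = t_0 < t_1 < \cdots < t_n = t$ with mesh $\delta = \max_i (t_{i+1} - t_i)$. Write the change in $V$ as a telescoping sum and Taylor-expand each piece to second order:
$$V(t, X_t) - V(0, X_0) = \sum_i \left[ V_t\,\Delta t_i + V_x\,\Delta X_i + \tfrac12 V_{xx}\,(\Delta X_i)^2 + \cdots \right]$$
As $\delta \to 0$:
$\sum V_t\,\Delta t_i \to \int_0^t V_t\,ds$ (an ordinary Riemann sum).
$\sum V_x\,\Delta X_i \to \int_0^t V_x a\,ds + \int_0^t V_x b\,dW_s$ (the integral against $dX$, with left-endpoint evaluation).
$\sum V_{xx}(\Delta X_i)^2 \to \int_0^t V_{xx} b^2\,ds$ (by quadratic variation $(dW)^2 = dt$; the cross-term $\Delta t_i\,\Delta X_i$ and the pure $(\Delta t_i)^2$ vanish).
Higher-order terms vanish.
Combining and halving the $V_{xx}$ term gives the Itô formula. A full proof replaces each heuristic step with an $L^2$ convergence argument; the structure above is the mnemonic every practitioner uses.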
Key properties
It is a second-order chain rule
The ordinary chain rule keeps only the first derivative because smooth increments satisfy $(dx)^2 = o(dt)$. Brownian-driven increments satisfy $(dW_t)^2 = dt$ in the quadratic-variation sense, so the second derivative contributes to drift.
It separates drift from innovation
After applying Itô's lemma, every transformed process has a $dt$ part and a $dW_t$ part. In finance, the $dW_t$ coefficient is the hedgeable shock exposure, while the $dt$ coefficient is the drift to be eliminated, priced, or interpreted.
Convexity controls the sign of the correction
The correction term
$$\tfrac12 b_t^2 V_{xx}\,dt$$
has the sign of $V_{xx}$. Convex functions receive positive drift from volatility; concave functions receive negative drift. This is the calculus behind gamma exposure and volatility drag.
Time-dependent functions add a separate theta term
For $V(t, X_t)$, the $V_t\,dt$ term is a pure drift term with no $dW_t$ component. In option pricing this is theta: the time-decay component that must balance gamma and financing terms in the Black-Scholes PDE.
Worked examples
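Example 1: The log of geometric Brownian motion
This is the single most-referenced application of Itô's lemma. Let $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$ (so $a = \mu S_t$, $b = \sigma S_t$), and set $V(t,S) = \ln S$. Then $V_t = 0$, $V_S = 1/S$, $V_{SS} = -1/S^2$. Itô's lemma gives:
$$d(\ln S_t) = \left(\mu - \tfrac12\sigma^2\right)dt + \sigma\,dW_t$$
Integrate and exponentiate to get the closed-form GBM solution $S_t = S_0 \exp\!\left((\mu - \tfrac12\sigma^2)t + \sigma W_t\right)$. The $-\tfrac12\sigma^2$ comes entirely from the Itô correction $\tfrac12 b^2 V_{xx} = \tfrac12(\sigma S)^2(-1/S^2) = -\tfrac12\sigma^2$.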
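The drift of the log can also be seen in simulation. The sketch below, assuming NumPy, steps the SDE $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$ forward with a plain Euler-Maruyama scheme and compares the average log-return with $\mu T$ and with $(\mu - \tfrac12\sigma^2)T$. The function name `simulate_gbm_euler` and all parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_gbm_euler(s0, mu, sigma, T, n_steps, n_paths, seed=0):
    """Terminal values S_T from Euler-Maruyama steps of dS = mu*S dt + sigma*S dW."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    s = np.full(n_paths, float(s0))
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        s = s + mu * s * dt + sigma * s * dW  # one Euler-Maruyama step
    return s


s0, mu, sigma, T = 100.0, 0.10, 0.30, 1.0
s_T = simulate_gbm_euler(s0, mu, sigma, T, n_steps=250, n_paths=20_000)
mean_log_return = np.log(s_T / s0).mean()

print(f"simulated mean of ln(S_T / S_0): {mean_log_return:.4f}")
print(f"mu * T (no correction):          {mu * T:.4f}")
print(f"(mu - sigma^2 / 2) * T (Ito):    {(mu - 0.5 * sigma**2) * T:.4f}")
print(f"simulated mean of S_T:           {s_T.mean():.2f}")
print(f"S_0 * exp(mu * T):               {s0 * np.exp(mu * T):.2f}")
```

The mean of $S_T$ still grows at rate $\mu$; it is only the mean of the log that picks up the $-\tfrac12\sigma^2$, which is the volatility-drag effect described earlier.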
Example 2: $W_t^2$ and $W_t^3$
Setting $V(W) = W^2$: $V' = 2W$, $V'' = 2$. With $dX_t = dW_t$ (so $a = 0$, $b = 1$):
$$d(W_t^2) = \left(0 + 0 + \tfrac12 \cdot 1 \cdot 2\right)dt + 1 \cdot 2W_t\,dW_t = dt + 2W_t\,dW_t$$
Taking expectations: $E[W_t^2] = 0 + \int_0^t 1\,ds = t$, confirming the known variance.
Setting $V(W) = W^3$: $V' = 3W^2$, $V'' = 6W$. Same $a = 0$, $b = 1$:
$$d(W_t^3) = \tfrac12 \cdot 6W_t\,dt + 3W_t^2\,dW_t = 3W_t\,dt + 3W_t^2\,dW_t$$
So $E[W_t^3] = \int_0^t 3E[W_s]\,ds = 0$. (The mean-zero, symmetric structure of Brownian motion shows up as an automatic cancellation.)
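The same recipe extends one power higher. As an additional worked check (not one of the lesson's own examples), take $V(W) = W^4$, so $V' = 4W^3$ and $V'' = 12W^2$, again with $a = 0$, $b = 1$:

```latex
\begin{aligned}
d(W_t^4) &= \tfrac12 \cdot 12\,W_t^2\,dt + 4W_t^3\,dW_t
          = 6W_t^2\,dt + 4W_t^3\,dW_t, \\
E[W_t^4] &= \int_0^t 6\,E[W_s^2]\,ds = \int_0^t 6s\,ds = 3t^2 .
\end{aligned}
```

This agrees with the fourth moment of a $N(0, t)$ random variable, which is $3t^2$.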
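Example 3: Exponential martingale and the Doléans-Dade exponential
Let $V(t,W) = \exp(\sigma W - \tfrac12\sigma^2 t)$. Then $V_t = -\tfrac12\sigma^2 V$, $V_W = \sigma V$, $V_{WW} = \sigma^2 V$. With $dX_t = dW_t$:
$$dV = \left(-\tfrac12\sigma^2 V + 0 + \tfrac12\sigma^2 V\right)dt + \sigma V\,dW_t = \sigma V\,dW_t$$
The drift cancels exactly. The process $V(t, W_t)$ has no drift and its integrand $\sigma V$ is square-integrable, so it is a martingale: this is the exponential martingale, the foundation of Girsanov's theorem and one of the simplest non-trivial Brownian martingales. The cancellation is the Itô correction at work: the $+\tfrac12\sigma^2 V$ from $\tfrac12 b^2 V_{xx}$ precisely offsets the $-\tfrac12\sigma^2 V$ in $V_t$.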
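A quick Monte Carlo check of the martingale property: the expectation of $\exp(\sigma W_T - \tfrac12\sigma^2 T)$ should stay at its starting value of 1, while dropping the $-\tfrac12\sigma^2 T$ term lets the mean drift up to $\exp(\tfrac12\sigma^2 T)$. This is a minimal sketch assuming NumPy; the parameter values and variable names are illustrative, and $W_T$ is sampled directly from $N(0, T)$ rather than built from a path.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, T, n_paths = 0.5, 2.0, 400_000

# W_T ~ N(0, T), sampled directly.
W_T = rng.normal(0.0, np.sqrt(T), size=n_paths)

with_correction = np.exp(sigma * W_T - 0.5 * sigma**2 * T).mean()
without_correction = np.exp(sigma * W_T).mean()

print(f"E[exp(sigma W_T - sigma^2 T / 2)] ~ {with_correction:.4f}   (theory: 1)")
print(f"E[exp(sigma W_T)]                 ~ {without_correction:.4f}   "
      f"(theory: {np.exp(0.5 * sigma**2 * T):.4f})")
```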
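Example 4: Derivation of the Black-Scholes PDE
This is the payoff. Let the stock follow $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$, and let $V(t, S_t)$ be the value of a derivative. Applying Itô:
$$dV = \left(V_t + \mu S V_S + \tfrac12\sigma^2 S^2 V_{SS}\right)dt + \sigma S V_S\,dW_t$$
Now construct a self-financing hedge portfolio $\Pi = V - \Delta \cdot S$ where $\Delta = V_S$. The stochastic terms cancel:
$$d\Pi = dV - V_S\,dS = \left(V_t + \tfrac12\sigma^2 S^2 V_{SS}\right)dt$$
$\Pi$ is instantaneously riskless, so by no-arbitrage it must earn the risk-free rate $r$:
$$d\Pi = r\Pi\,dt = r\left(V - S V_S\right)dt \quad\Longrightarrow\quad V_t + \tfrac12\sigma^2 S^2 V_{SS} + r S V_S - r V = 0$$
The $\tfrac12\sigma^2 S^2 V_{SS}$ term (gamma times variance) is the Itô correction. Without it there is no PDE, no closed-form option price, and no delta hedging. The entire derivative-pricing edifice rests on this single $\tfrac12 b^2 V_{xx}$.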
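One way to sanity-check the result is to plug the standard Black-Scholes call formula into the Black-Scholes PDE $V_t + \tfrac12\sigma^2 S^2 V_{SS} + r S V_S - r V = 0$ and confirm that the residual is numerically zero. The sketch below uses only the Python standard library and central finite differences; the function names, step sizes, and contract parameters are illustrative assumptions, and the closed-form call price is the textbook formula rather than something derived in this lesson.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(t, s, strike, r, sigma, maturity):
    """Black-Scholes price of a European call at time t < maturity."""
    tau = maturity - t
    d1 = (log(s / strike) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return s * norm_cdf(d1) - strike * exp(-r * tau) * norm_cdf(d2)


def pde_terms(t, s, strike, r, sigma, maturity, ht=1e-4, hs=1e-2):
    """Central-difference estimate of (PDE residual, gamma term) at (t, s)."""
    def v(tt, ss):
        return bs_call(tt, ss, strike, r, sigma, maturity)

    v_t = (v(t + ht, s) - v(t - ht, s)) / (2.0 * ht)
    v_s = (v(t, s + hs) - v(t, s - hs)) / (2.0 * hs)
    v_ss = (v(t, s + hs) - 2.0 * v(t, s) + v(t, s - hs)) / hs**2
    gamma_term = 0.5 * sigma**2 * s**2 * v_ss  # the Ito correction term
    residual = v_t + gamma_term + r * s * v_s - r * v(t, s)
    return residual, gamma_term


# Illustrative contract: at-the-money call, one year to expiry.
residual, gamma_term = pde_terms(
    t=0.0, s=100.0, strike=100.0, r=0.05, sigma=0.20, maturity=1.0
)
print(f"PDE residual:          {residual:.2e}")
print(f"gamma (Ito) term:      {gamma_term:.4f}")
print(f"residual without it:   {residual - gamma_term:.4f}")
```

The full residual is zero up to discretisation error, while leaving out the gamma term leaves a residual of the same size as that term, so the formula only satisfies the PDE when the Itô correction is included.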
Example 5: Short numerical check
The Itô correction is not an abstraction: you can see it on your laptop in ten seconds.
# Python: d(W^2) = dt + 2W dW implies E[W_T^2] = T, not 0
import numpy as np

rng = np.random.default_rng(0)
T, n, N = 1.0, 1000, 100_000
dt = T / n
# dW is an N x n float64 array (about 800 MB here); lower N if memory is tight
dW = rng.normal(0, np.sqrt(dt), size=(N, n))
W = np.cumsum(dW, axis=1)

# Naive "ordinary chain rule": d(W^2) = 2W dW, so W_T^2 = ∫2W dW has mean 0
naive_prediction = 0.0
# Itô: d(W^2) = dt + 2W dW, so W_T^2 = T + ∫2W dW has mean T
ito_prediction = T

empirical = (W[:, -1] ** 2).mean()
print(f"Ordinary-calculus prediction: {naive_prediction}")
print(f"Itô prediction: {ito_prediction}")
print(f"Empirical mean: {empirical:.4f}")
# Ordinary-calculus prediction: 0.0
# Itô prediction: 1.0
# Empirical mean: 1.0017
The empirical mean lands on the Itô prediction, not on zero. The Itô correction is the difference between a pricing model that works and one that doesn't.
Itô's lemma for multiple processes
In quant finance, many models have two or more Brownian drivers — stochastic volatility (Heston), multi-factor rates (two-factor HJM), multi-asset baskets. The multidimensional Itô formula generalises cleanly.
Let $X_t = (X_t^1, \ldots, X_t^n)$ be an $n$-dimensional Itô process driven by an $m$-dimensional Brownian motion $W_t = (W_t^1, \ldots, W_t^m)$:
$$dX_t^i = a_t^i\,dt + \sum_{k=1}^m b_t^{ik}\,dW_t^k$$
For a twice-differentiable $V(t, x_1, \ldots, x_n)$:
$$dV = V_t\,dt + \sum_i V_{x_i}\,dX_t^i + \tfrac12 \sum_{i,j} V_{x_i x_j}\,d[X^i, X^j]_t$$
where the quadratic covariation is:
$$d[X^i, X^j]_t = \sum_{k=1}^m b_t^{ik} b_t^{jk}\,dt$$
When the driving Brownians are correlated with $d[W^k, W^\ell]_t = \rho_{k\ell}\,dt$, the covariation formula generalises to $d[X^i, X^j]_t = \sum_{k,\ell} b_t^{ik} b_t^{j\ell} \rho_{k\ell}\,dt$. This is how the Heston model, the SABR model, and the Margrabe exchange-option formula are derived.
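The covariation rule $d[W^1, W^2]_t = \rho\,dt$ can be checked the same way as one-dimensional quadratic variation. The sketch below, assuming NumPy, builds two correlated Brownian increment series from independent normals and sums the products of their increments, which should land near $\rho T$. The function name `realized_covariation`, the construction $dW^2 = \rho\,dW^1 + \sqrt{1-\rho^2}\,dZ$, and the parameter values are illustrative choices.

```python
import numpy as np


def realized_covariation(rho, T=1.0, n_steps=200_000, seed=0):
    """Sum of dW1 * dW2 over [0, T] for two Brownians with correlation rho."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z1 = rng.normal(size=n_steps)
    z2 = rng.normal(size=n_steps)
    dW1 = np.sqrt(dt) * z1
    # Mix in an independent driver so that corr(dW1, dW2) = rho.
    dW2 = np.sqrt(dt) * (rho * z1 + np.sqrt(1.0 - rho**2) * z2)
    return float(np.sum(dW1 * dW2))


for rho in (-0.7, 0.0, 0.5):
    print(f"rho = {rho:+.1f}   sum of dW1*dW2 = {realized_covariation(rho):+.4f}")
```

A negative value such as $\rho = -0.7$ is typical of the spot-volatility correlation used in stochastic-volatility models, and it enters the pricing equation through exactly this cross term.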
Common confusions and pitfalls
"Itô's lemma is just the chain rule with an extra term." It is and it isn't. The form looks like a chain rule, but the content is different: ordinary calculus treats $(dx)^2$ as negligible, while Itô calculus treats $(dW)^2$ as a first-order contribution. Forgetting this distinction (say, computing $d(\ln S_t) = dS_t/S_t$ without the $-\tfrac12\sigma^2\,dt$) is one of the most common mistakes in stochastic calculus, and the resulting pricing model will have a drift error of $\tfrac12\sigma^2$ per year (roughly 3% per year at 25% volatility).
"$(dW_t)^2 = dt$ is an algebraic identity." It is not. You are not squaring a random number and getting a deterministic one. The identity is shorthand for the $L^2$ limit $\sum_i (\Delta W_i)^2 \to T$ under refinement. Inside an Itô calculation, treating $(dW_t)^2$ as $dt$ is a valid manipulation because what you actually compute is an integral against $dt$, not a pointwise value. Outside that context the "identity" is meaningless.
"Itô's lemma applies to any continuous process." No. It requires an Itô process, that is, an adapted integrand structure $dX = a\,dt + b\,dW$ with well-defined Itô integrals. For processes with jumps (Lévy, jump-diffusion), a generalised version with a compensator term is needed. For processes whose quadratic variation is zero or infinite rather than of the form $\int b^2\,dt$ (rough paths, fractional Brownian motion with $H \neq 1/2$), Itô's lemma in this form fails and specialised calculi apply.
"The Itô correction sign is always negative." The correction is $+\tfrac12 b^2 V_{xx}$, which has the sign of $V_{xx}$. For $V = \ln S$, $V_{SS} = -1/S^2 < 0$, giving a negative correction $-\tfrac12\sigma^2$. For $V = S^2$, $V_{SS} = 2 > 0$, giving a positive correction. The sign of the correction is the sign of the convexity of the payoff, which is why convex payoffs (long options) benefit from volatility and concave payoffs (short options) suffer.
"Itô's lemma gives the expectation of $V(t, X_t)$." It does not. It gives the SDE satisfied by $V(t, X_t)$. Taking expectations and using that Itô integrals have mean zero gives the ODE $\frac{d}{dt}E[V] = E\left[V_t + a V_x + \tfrac12 b^2 V_{xx}\right]$, but solving this ODE still requires knowing enough about the distribution of $X_t$, which usually means the Feynman-Kac formula or a direct distributional argument.
"Stratonovich calculus is a competing framework." Stratonovich integration, which uses a midpoint convention, does satisfy the ordinary chain rule. The catch for finance is that self-financing trading gains are naturally left-endpoint, non-anticipative objects, so Itô is the default pricing convention. Knowing the difference matters when reading non-finance stochastic-calculus references.
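The difference between the two conventions shows up directly in discrete sums for $\int_0^T W\,dW$. The sketch below, assuming NumPy, evaluates the integrand at the left endpoint (the Itô convention) and as the average of the two endpoints (one common discretisation of the Stratonovich integral) along the same simulated path. Variable names and grid size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_steps = 1.0, 200_000
dW = rng.normal(0.0, np.sqrt(T / n_steps), size=n_steps)
W = np.concatenate(([0.0], np.cumsum(dW)))  # W[i] is the path at grid point t_i

# Ito: integrand evaluated at the left endpoint of each interval.
ito_sum = np.sum(W[:-1] * dW)
# Stratonovich-style: integrand averaged over the two endpoints.
strat_sum = np.sum(0.5 * (W[:-1] + W[1:]) * dW)

print(f"left-endpoint sum:      {ito_sum:+.4f}   (W_T^2 - T)/2 = {(W[-1]**2 - T) / 2:+.4f}")
print(f"endpoint-average sum:   {strat_sum:+.4f}   W_T^2 / 2     = {W[-1]**2 / 2:+.4f}")
print(f"difference:             {strat_sum - ito_sum:+.4f}   T / 2         = {T / 2:+.4f}")
```

The endpoint-average sum telescopes to $W_T^2/2$ exactly, which is what the ordinary chain rule predicts, while the left-endpoint sum falls short by half the realised quadratic variation, which is close to $T/2$.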
Where this goes next
Geometric Brownian Motion: The closed-form GBM solution is pure Itô. The $-\tfrac12\sigma^2$ is the Itô correction applied to $\ln S$.
Stochastic Differential Equations: The integral-equation framework in which Itô's lemma is formally defined. Also supplies the existence and uniqueness theorems needed to apply it.
Black-Scholes PDE: The capstone application. Itô's lemma applied to $V(t, S_t)$ plus a self-financing hedge yields the pricing PDE.
Girsanov's Theorem: Uses the exponential martingale from Example 3 to change Brownian drift under an absolutely continuous measure change.
Jump-Diffusion Processes: Generalises Itô's lemma to include jumps (Merton's jump-diffusion). The extra term is a compensator integral against the Poisson random measure.
References
Lawler, G. F. (2023). Stochastic Calculus: An Introduction with Applications. Ch. 3 §3.3 (Itô's formula), §3.4 (More versions of Itô's formula), especially Examples 3.3.1-3.3.2 and 3.4.1.
Albin, P., Hamza, K., & Klebaner, F. C. (2025). Problems and Solutions in Stochastic Calculus with Applications. World Scientific. Ch. 4 (Brownian Motion Calculus) — supporting exercise checks.