Itô's Lemma

Motivation: why this matters in quant finance

Every derivative priced in modern finance (a European call, an interest-rate swap, a credit default swap, a barrier option) is a function of a stochastic process. If the underlying stock follows $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$, then the option price $V(t, S_t)$ is a function of time and a diffusion. To compute $dV$, to construct a hedge, or to derive a pricing PDE, you need a chain rule.

The ordinary chain rule does not work. If you write down $V(t, S_t)$ and apply $dV = V_t\,dt + V_S\,dS_t$, you will get the wrong answer in a way that ruins every calculation downstream. The reason is subtle but fatal: the increments of Brownian motion scale as $\sqrt{dt}$, not $dt$, so the squared increments $(dW_t)^2$ contribute at order $dt$ and do not vanish in the limit. A second-order term that ordinary calculus discards is, in stochastic calculus, a first-order correction.

Itô's lemma is the stochastic chain rule that keeps track of this correction. Written in its most useful form, for $X_t$ satisfying $dX_t = a\,dt + b\,dW_t$ and $V = V(t, X_t)$:

dV = \left(V_t + a\,V_x + \tfrac{1}{2}b^2 V_{xx}\right)dt + b\,V_x\,dW_t

The extra term $\tfrac{1}{2}b^2 V_{xx}\,dt$, absent from ordinary calculus, is the Itô correction. Every hallmark of the Black-Scholes framework routes through it: the $-\tfrac{1}{2}\sigma^2$ in the log-return drift of geometric Brownian motion, the $\tfrac{1}{2}\sigma^2 S^2 V_{SS}$ term in the Black-Scholes PDE, the gamma-theta balance in delta hedging. Lose the correction and you lose the entire pricing theory.

The informal idea

Why does ordinary calculus fail, and what exactly does Itô fix? The argument has three steps.

Step 1: Taylor expansion. For a smooth $f$ and small $\Delta x$:

f(x + \Delta x) - f(x) = f'(x)\Delta x + \tfrac{1}{2}f''(x)(\Delta x)^2 + O((\Delta x)^3)

In ordinary calculus, when $x(t)$ is smooth, the increment $\Delta x$ scales as $\Delta t$ , so $(\Delta x)^2 = O((\Delta t)^2)$ . When we divide by $\Delta t$ and take the limit, the second-order term vanishes. Only $f'(x)\,dx$ survives — this is the ordinary chain rule.

Step 2: Brownian scale is different. For $X_t = W_t$, the increment $\Delta W$ is Gaussian with standard deviation $\sqrt{\Delta t}$. So $\Delta W = O(\sqrt{\Delta t})$, and:

(\Delta W)^2 = O(\Delta t)

The second-order term does not vanish: it has the same order as the deterministic term $\Delta t$. When we sum the Taylor expansion over many small intervals and take the limit, the $\tfrac{1}{2}f''(\Delta W)^2$ contribution adds up to a finite correction that behaves like a $dt$ integral rather than like noise.

Step 3: Quadratic variation pins down the correction. The lesson on Brownian motion established that $\sum (W_{t_{i+1}} - W_{t_i})^2 \to T$ in $L^2$ as the partition of $[0, T]$ is refined: the squared increments accumulate to the elapsed time. In differential notation, $(dW_t)^2 = dt$ (as a rule for manipulating differentials, not as a pointwise algebraic identity). So the Taylor correction for $f(W_t)$ becomes:

\tfrac{1}{2}f''(W_t)(dW_t)^2 = \tfrac{1}{2}f''(W_t)\,dt

This is the Itô correction. It is entirely a consequence of the $(dW)^2 = dt$ rule, which in turn is a consequence of $\operatorname{Var}(\Delta W) = \Delta t$ .
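The quadratic-variation limit in Step 3 is easy to watch numerically. The sketch below is illustrative only: the seed, the grid sizes, and the helper name `realized_quadratic_variation` are choices made here, not part of the lesson. It sums squared simulated increments on finer and finer grids over $[0, T]$.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only
T = 1.0

def realized_quadratic_variation(n_steps: int) -> float:
    """Sum of squared Brownian increments over [0, T] on an n_steps grid."""
    dt = T / n_steps
    dW = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    return float(np.sum(dW ** 2))

# Finer partitions should give sums that cluster more tightly around T,
# because Var(sum (dW)^2) = 2 T^2 / n_steps shrinks with the mesh.
for n_steps in (100, 10_000, 1_000_000):
    qv = realized_quadratic_variation(n_steps)
    print(f"n = {n_steps:>9,d}   sum (dW)^2 = {qv:.4f}   (T = {T})")

qv_fine = realized_quadratic_variation(1_000_000)
```

Each run draws a fresh path, so the coarse-grid values scatter visibly around $T$ while the fine-grid values sit close to it.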

The canonical counterexample

Consider $Y_t = W_t^2$ . The ordinary chain rule would give $dY_t = 2W_t\,dW_t$ , from which $\mathbb{E}[dY_t] = 0$ (Itô integrals have mean zero), implying $\mathbb{E}[W_t^2]$ is constant. But we know $\mathbb{E}[W_t^2] = t$ — it grows linearly. Ordinary calculus is off by exactly $dt$ .

Itô's lemma applied to $f(x) = x^2$ (so $f'(x) = 2x$ , $f''(x) = 2$ ) gives:

dY_t = 2W_t\,dW_t + \tfrac{1}{2}\cdot 2 \cdot (dW_t)^2 = 2W_t\,dW_t + dt

Taking expectations: $\mathbb{E}[dY_t] = 0 + dt$ , so $\mathbb{E}[W_t^2] = t$ . ✓

The missing $dt$ is the Itô correction. Every apparent paradox involving " $d(\text{function of } W_t)$ " resolves the same way.
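The counterexample also holds path by path on a discrete grid, where $W_{i+1}^2 - W_i^2 = 2W_i\,\Delta W_i + (\Delta W_i)^2$ is an exact algebraic identity. A minimal sketch, assuming a single simulated path with an arbitrary seed and grid size (both are choices made here):

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
T, n = 1.0, 100_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))   # W[0] = 0, ..., W[n] = W_T

# Left-endpoint sum: discrete stand-in for the Ito integral of 2W dW.
ito_sum = np.sum(2.0 * W[:-1] * dW)
# Sum of squared increments: discrete stand-in for the dt term.
squared_increments = np.sum(dW ** 2)

lhs = W[-1] ** 2
print(f"W_T^2                      = {lhs:.6f}")
print(f"sum 2 W dW + sum (dW)^2    = {ito_sum + squared_increments:.6f}")
print(f"sum 2 W dW alone (naive)   = {ito_sum:.6f}")
print(f"sum (dW)^2 (should be ~T)  = {squared_increments:.6f}")
```

The first two printed numbers agree to rounding error on every path; the naive sum misses by the squared-increment term, which is close to $T$.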

Formal definitions

Lawler presents Itô's formula first for $f(W_t)$ and then extends it to time-dependent functions and diffusions. In the notation used across this vault, let $(W_t)_{t \ge 0}$ be a standard Brownian motion on a filtered probability space, and let $(X_t)_{t \ge 0}$ be an Itô process:

X_t = X_0 + \int_0^t a(s, \omega)\,ds + \int_0^t b(s, \omega)\,dW_s

where $a$ and $b$ are adapted processes satisfying the usual integrability conditions ( $\int_0^T |a_s|\,ds < \infty$ and $\int_0^T b_s^2\,ds < \infty$ almost surely). In differential notation:

dX_t = a_t\,dt + b_t\,dW_t

Itô's lemma (one-dimensional). Let $V(t, x): [0, \infty) \times \mathbb{R} \to \mathbb{R}$ be a function with continuous partial derivatives $V_t$, $V_x$, and $V_{xx}$. Then the process $Y_t := V(t, X_t)$ is itself an Itô process, and:

\boxed{\,dY_t = \left(V_t + a_t V_x + \tfrac{1}{2}b_t^2 V_{xx}\right)dt + b_t V_x\,dW_t\,}

where the partial derivatives are evaluated at $(t, X_t)$ . Equivalently, in integral form:

V(t, X_t) = V(0, X_0) + \int_0^t \left(V_s + a_s V_x + \tfrac{1}{2}b_s^2 V_{xx}\right)ds + \int_0^t b_s V_x\,dW_s

The three terms inside the $ds$ integral form the drift; the $dW_s$ integral is the diffusion. The Itô correction is the $\tfrac{1}{2}b^2 V_{xx}$ term in the drift.
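The formula can be read as a recipe: given $V$, $a$, and $b$, it returns the drift and diffusion coefficients of $V(t, X_t)$. The sketch below is one possible way to evaluate that recipe numerically. It assumes the Markov special case where $a$ and $b$ are functions of $(t, x)$, approximates the partial derivatives by central finite differences, and uses a hypothetical helper name `ito_coefficients` and step size `h` chosen here for illustration.

```python
import math
from typing import Callable, Tuple

def ito_coefficients(
    V: Callable[[float, float], float],
    a: Callable[[float, float], float],
    b: Callable[[float, float], float],
    t: float,
    x: float,
    h: float = 1e-4,
) -> Tuple[float, float]:
    """Return (drift, diffusion) of dV(t, X_t) evaluated at the point (t, x).

    Partial derivatives are approximated by central finite differences,
    so the result carries a small discretisation error.
    """
    hx = h * max(1.0, abs(x))  # scale the spatial step with |x|
    V_t = (V(t + h, x) - V(t - h, x)) / (2.0 * h)
    V_x = (V(t, x + hx) - V(t, x - hx)) / (2.0 * hx)
    V_xx = (V(t, x + hx) - 2.0 * V(t, x) + V(t, x - hx)) / hx ** 2
    drift = V_t + a(t, x) * V_x + 0.5 * b(t, x) ** 2 * V_xx
    diffusion = b(t, x) * V_x
    return drift, diffusion

# Usage: V = ln(S) with geometric-Brownian-motion coefficients.
mu, sigma = 0.08, 0.25  # illustrative parameter values
drift, diffusion = ito_coefficients(
    V=lambda t, s: math.log(s),
    a=lambda t, s: mu * s,
    b=lambda t, s: sigma * s,
    t=0.5,
    x=100.0,
)
print(f"drift     = {drift:.6f}   (mu - sigma^2/2 = {mu - 0.5 * sigma**2:.6f})")
print(f"diffusion = {diffusion:.6f}   (sigma = {sigma:.6f})")
```

Finite differences keep the example dependency-free; a symbolic or automatic-differentiation tool would serve equally well.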

Differential rules used in practice

Itô's lemma is usually applied mechanically via the following multiplication rules, which encode $(dW_t)^2 = dt$ and the vanishing of all other second-order products:

| $\cdot$ | $dt$ | $dW_t$ |
|---|---|---|
| $dt$ | $0$ | $0$ |
| $dW_t$ | $0$ | $dt$ |

To apply Itô's lemma, write out the second-order Taylor expansion, substitute $dX_t = a\,dt + b\,dW_t$ , square it, and drop all terms that vanish by the table. What remains is the Itô formula.
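Written out once, the mechanical procedure looks like this (same symbols as the lemma; the mixed terms $V_{tx}\,dt\,dX_t$ and $\tfrac{1}{2}V_{tt}\,(dt)^2$ are omitted because the table sends them to zero):

```latex
\begin{aligned}
(dX_t)^2 &= (a\,dt + b\,dW_t)^2 \\
         &= a^2\,(dt)^2 + 2ab\,dt\,dW_t + b^2\,(dW_t)^2 \\
         &= 0 + 0 + b^2\,dt, \\[4pt]
dV &= V_t\,dt + V_x\,dX_t + \tfrac{1}{2}V_{xx}\,(dX_t)^2 \\
   &= V_t\,dt + V_x\,(a\,dt + b\,dW_t) + \tfrac{1}{2}b^2 V_{xx}\,dt \\
   &= \left(V_t + a\,V_x + \tfrac{1}{2}b^2 V_{xx}\right)dt + b\,V_x\,dW_t .
\end{aligned}
```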

Sketch of the proof

Partition $[0, t]$ into $0 = t_0 < t_1 < \cdots < t_n = t$ with mesh $\delta = \max_i(t_{i+1} - t_i)$ . Write:

V(t, X_t) - V(0, X_0) = \sum_{i=0}^{n-1} \big[V(t_{i+1}, X_{t_{i+1}}) - V(t_i, X_{t_i})\big]

Apply a two-dimensional Taylor expansion to each summand:

V(t_{i+1}, X_{t_{i+1}}) - V(t_i, X_{t_i}) = V_t\Delta t_i + V_x\Delta X_i + \tfrac{1}{2}V_{xx}(\Delta X_i)^2 + V_{tx}\Delta t_i \Delta X_i + \tfrac{1}{2}V_{tt}(\Delta t_i)^2 + \text{higher order}

with partial derivatives evaluated at $(t_i, X_{t_i})$ . Summing and taking $\delta \to 0$ :

$\sum V_t\Delta t_i \to \int_0^t V_t\,ds$ (standard Riemann integral).
$\sum V_x\Delta X_i \to \int_0^t V_x\,dX_s = \int_0^t a V_x\,ds + \int_0^t b V_x\,dW_s$ (Itô integral).
$\sum \tfrac{1}{2}V_{xx}(\Delta X_i)^2 \to \tfrac{1}{2}\int_0^t V_{xx} b^2\,ds$ (by quadratic variation, $(dW)^2 = dt$ ).
The cross-term sum $\sum V_{tx}\Delta t_i\Delta X_i$, the pure-time sum $\sum \tfrac{1}{2}V_{tt}(\Delta t_i)^2$, and the higher-order terms all vanish.

Combining the surviving limits gives the Itô formula. A full proof replaces each heuristic step with an $L^2$ convergence argument; the structure above is the mnemonic every practitioner uses.
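The partition argument can be replayed on one simulated path. The sketch below is an illustration under assumptions made here: it takes the time-independent case $V(t, x) = \sin x$ with $X_t = W_t$ (so $a = 0$, $b = 1$), an arbitrary seed, and a grid of 200,000 steps, then compares the telescoped increment with the left-endpoint sums from the proof.

```python
import numpy as np

rng = np.random.default_rng(7)  # arbitrary seed
T, n = 1.0, 200_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))
W_left = W[:-1]                       # evaluation points t_i (left endpoints)

f = np.sin                            # test function
f1 = np.cos                           # first derivative
f2 = lambda x: -np.sin(x)             # second derivative

increment = f(W[-1]) - f(W[0])                 # V(t, X_t) - V(0, X_0)
ito_integral = np.sum(f1(W_left) * dW)         # sum V_x * dX
correction = 0.5 * np.sum(f2(W_left) * dt)     # (1/2) sum V_xx * b^2 * dt

residual_ito = increment - (ito_integral + correction)
residual_naive = increment - ito_integral

print(f"residual with Ito correction:    {residual_ito:+.5f}")
print(f"residual without the correction: {residual_naive:+.5f}")
```

The corrected residual shrinks as the mesh is refined; the uncorrected one converges to the path's own value of $\tfrac{1}{2}\int_0^T f''(W_s)\,ds$, which need not be small.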

Key properties

It is a second-order chain rule

The ordinary chain rule keeps only the first derivative because smooth increments satisfy $(dx)^2=o(dt)$ . Brownian-driven increments satisfy $(dW_t)^2=dt$ in the quadratic-variation sense, so the second derivative contributes to drift.

It separates drift from innovation

After applying Itô's lemma, every transformed process has a $dt$ part and a $dW_t$ part. In finance, the $dW_t$ coefficient is the hedgeable shock exposure, while the $dt$ coefficient is the drift to be eliminated, priced, or interpreted.

Convexity controls the sign of the correction

The correction term $\frac{1}{2}b_t^2V_{xx}\,dt$ has the sign of $V_{xx}$ . Convex functions receive positive drift from volatility; concave functions receive negative drift. This is the calculus behind gamma exposure and volatility drag.
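A Monte Carlo sketch makes the sign rule concrete. It assumes a driftless geometric Brownian motion ( $\mu = 0$ ) sampled exactly at time $T$ via the closed-form solution from Example 1, and compares a convex function ( $S^2$ ) with a concave one ( $\sqrt{S}$ ). The parameter values, seed, and sample size are illustrative choices made here.

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed
S0, sigma, T, N = 1.0, 0.3, 1.0, 400_000

# Exact sample of driftless GBM at time T: S_T = S0 * exp(-sigma^2 T / 2 + sigma W_T).
W_T = rng.normal(0.0, np.sqrt(T), size=N)
S_T = S0 * np.exp(-0.5 * sigma**2 * T + sigma * W_T)

convex_mean = np.mean(S_T ** 2)        # V(S) = S^2,     V_SS > 0
concave_mean = np.mean(np.sqrt(S_T))   # V(S) = sqrt(S), V_SS < 0

# Ito on S^p with mu = 0 gives drift 0.5 * sigma^2 * p * (p - 1) * S^p,
# hence E[S_T^p] = S0^p * exp(0.5 * sigma^2 * p * (p - 1) * T).
convex_theory = S0**2 * np.exp(sigma**2 * T)                  # p = 2
concave_theory = np.sqrt(S0) * np.exp(-sigma**2 * T / 8.0)    # p = 1/2

print(f"E[S_T^2]      simulated {convex_mean:.4f}   theory {convex_theory:.4f}   start {S0**2:.4f}")
print(f"E[sqrt(S_T)]  simulated {concave_mean:.4f}   theory {concave_theory:.4f}   start {np.sqrt(S0):.4f}")
```

With no drift in $S$ itself, any movement in these expectations comes from the $\tfrac{1}{2}b^2 V_{SS}$ term alone: upward for the convex function, downward for the concave one.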

Time-dependent functions add a separate theta term

For $V(t,X_t)$, the $V_t\,dt$ term carries no $dW_t$ exposure: it enters only the drift. In option pricing this is theta, the time-decay component that must balance gamma and financing terms in the Black-Scholes PDE.

Worked examples

Example 1 — The log of geometric Brownian motion

This is the single most-referenced application of Itô's lemma. Let $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$ (so $a = \mu S_t$ , $b = \sigma S_t$ ), and set $V(t, S) = \ln S$ . Then $V_t = 0$ , $V_S = 1/S$ , $V_{SS} = -1/S^2$ . Itô's lemma gives:

d(\ln S_t) = \left(0 + \mu S_t \cdot \tfrac{1}{S_t} + \tfrac{1}{2}\sigma^2 S_t^2 \cdot \left(-\tfrac{1}{S_t^2}\right)\right)dt + \sigma S_t \cdot \tfrac{1}{S_t}\,dW_t = \left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW_t

Integrate and exponentiate to get the closed-form GBM solution $S_t = S_0\exp((\mu - \tfrac{1}{2}\sigma^2)t + \sigma W_t)$. The $-\tfrac{1}{2}\sigma^2$ is entirely from the Itô correction:

\tfrac{1}{2}b^2 V_{xx} = \tfrac{1}{2}(\sigma S)^2 (-1/S^2) = -\tfrac{1}{2}\sigma^2
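The log-drift can be checked without using the closed-form solution at all. The sketch below simulates the SDE for $S_t$ directly with an Euler-Maruyama scheme (a discretisation chosen here for simplicity, not something the lesson prescribes) and measures the average log return. Parameter values, seed, and grid are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)  # arbitrary seed
mu, sigma, S0 = 0.08, 0.25, 100.0
T, n_steps, n_paths = 1.0, 250, 100_000
dt = T / n_steps

# Euler-Maruyama on the SDE itself: S <- S + mu*S*dt + sigma*S*dW.
S = np.full(n_paths, S0)
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
    S = S * (1.0 + mu * dt + sigma * dW)

mean_log_return = np.mean(np.log(S / S0))
ito_drift = (mu - 0.5 * sigma**2) * T    # prediction from Ito's lemma
naive_drift = mu * T                     # prediction from d(ln S) = dS / S

print(f"simulated E[ln(S_T / S_0)] = {mean_log_return:.4f}")
print(f"Ito prediction             = {ito_drift:.4f}")
print(f"naive prediction           = {naive_drift:.4f}")
```

The simulation never takes a logarithm until the very end, so the $-\tfrac{1}{2}\sigma^2$ shift in the result is produced by the dynamics themselves rather than assumed.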

Example 2 — $W_t^2$ and $W_t^3$

Setting $V(W) = W^2$ : $V' = 2W$ , $V'' = 2$ . With $dX_t = dW_t$ (so $a = 0$ , $b = 1$ ):

d(W_t^2) = (0 + 0 + \tfrac{1}{2}\cdot 1 \cdot 2)\,dt + 1\cdot 2W_t\,dW_t = dt + 2W_t\,dW_t

Taking expectations: $\mathbb{E}[W_t^2] = 0 + \int_0^t 1\,ds = t$ , confirming the known variance.

Setting $V(W) = W^3$ : $V' = 3W^2$ , $V'' = 6W$ . Same $a = 0$ , $b = 1$ :

d(W_t^3) = \tfrac{1}{2}\cdot 6W_t\,dt + 3W_t^2\,dW_t = 3W_t\,dt + 3W_t^2\,dW_t

So $\mathbb{E}[W_t^3] = \int_0^t 3\mathbb{E}[W_s]\,ds = 0$ . (The mean-zero, symmetric structure of Brownian motion shows up as an automatic cancellation.)
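The same two-line computation works for any integer power, and it turns Itô's lemma into a moment recursion. As a further worked case (not in the source, but it follows from the formula with $a = 0$ , $b = 1$ ), the fourth moment comes out as the familiar $3t^2$ :

```latex
\begin{aligned}
d(W_t^n) &= \tfrac{1}{2}\,n(n-1)\,W_t^{n-2}\,dt + n\,W_t^{n-1}\,dW_t, \qquad n \ge 2, \\[4pt]
d(W_t^4) &= 6\,W_t^2\,dt + 4\,W_t^3\,dW_t, \\
\mathbb{E}[W_t^4] &= \int_0^t 6\,\mathbb{E}[W_s^2]\,ds = \int_0^t 6s\,ds = 3t^2 .
\end{aligned}
```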

Example 3 — Exponential martingale and the Doléans-Dade exponential

Let $V(t, W) = \exp(\sigma W - \tfrac{1}{2}\sigma^2 t)$ . Then $V_t = -\tfrac{1}{2}\sigma^2 V$ , $V_W = \sigma V$ , $V_{WW} = \sigma^2 V$ . With $dX_t = dW_t$ :

dV = \left(-\tfrac{1}{2}\sigma^2 V + 0 + \tfrac{1}{2}\sigma^2 V\right)dt + \sigma V\,dW_t = \sigma V\,dW_t

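The drift cancels exactly. The process $V(t, W_t)$ has no $dt$ term, so it is a local martingale; because the integrand $\sigma V$ is square-integrable here ( $\mathbb{E}[V(s, W_s)^2] = e^{\sigma^2 s}$ ), it is a true martingale. This is the exponential martingale, the foundation of Girsanov's theorem and one of the simplest non-trivial Brownian martingales. The cancellation is the Itô correction at work: the $+\tfrac{1}{2}\sigma^2 V$ from $\tfrac{1}{2}b^2 V_{WW}$ precisely offsets the $-\tfrac{1}{2}\sigma^2 V$ from the time derivative $V_t$.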
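A constant expectation is the easiest martingale property to test. The sketch below draws $W_T$ exactly at a few horizons and averages the exponential with and without the $-\tfrac{1}{2}\sigma^2 t$ compensator. The value of $\sigma$, the horizons, the seed, and the sample size are illustrative choices made here.

```python
import numpy as np

rng = np.random.default_rng(5)  # arbitrary seed
sigma, N = 0.4, 400_000

results = {}
for T in (0.5, 1.0, 2.0):
    W_T = rng.normal(0.0, np.sqrt(T), size=N)                 # exact draw of W_T
    compensated = np.exp(sigma * W_T - 0.5 * sigma**2 * T)    # exponential martingale
    uncompensated = np.exp(sigma * W_T)                       # no Ito compensator
    results[T] = (compensated.mean(), uncompensated.mean())
    print(
        f"T = {T:.1f}   E[exp(sigma W - sigma^2 T / 2)] = {results[T][0]:.4f}"
        f"   E[exp(sigma W)] = {results[T][1]:.4f}"
        f"   exp(sigma^2 T / 2) = {np.exp(0.5 * sigma**2 * T):.4f}"
    )
```

The compensated mean stays near 1 at every horizon, while the uncompensated mean grows like $e^{\sigma^2 T/2}$ , which is exactly the drift the compensator removes.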

Example 4 — Derivation of the Black-Scholes PDE

This is the payoff. Let the stock follow $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$ , and let $V(t, S_t)$ be the value of a derivative. Applying Itô:

dV = \left(V_t + \mu S V_S + \tfrac{1}{2}\sigma^2 S^2 V_{SS}\right)dt + \sigma S V_S\,dW_t

Now construct a self-financing hedge portfolio $\Pi = V - \Delta\cdot S$ where $\Delta = V_S$ . The stochastic terms cancel:

d\Pi = dV - V_S\,dS = \left(V_t + \tfrac{1}{2}\sigma^2 S^2 V_{SS}\right)dt

$\Pi$ is instantaneously riskless, so by no-arbitrage it must earn the risk-free rate $r$ :

d\Pi = r\Pi\,dt = r(V - S V_S)\,dt

Equating the two expressions for $d\Pi$ yields the Black-Scholes PDE:

V_t + rSV_S + \tfrac{1}{2}\sigma^2 S^2 V_{SS} - rV = 0

The $\tfrac{1}{2}\sigma^2 S^2 V_{SS}$ term — the gamma-times-variance — is the Itô correction. Without it there is no PDE, no closed-form option price, and no delta hedging. The entire derivative-pricing edifice rests on this single $\tfrac{1}{2}b^2 V_{xx}$ .
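One way to sanity-check the PDE is to plug a known solution into it. The sketch below uses the standard Black-Scholes European call formula, which this lesson mentions but does not state, so treat it as an outside ingredient. It evaluates the left-hand side of the PDE with central finite differences. The strike, maturity, rate, volatility, and step sizes are illustrative assumptions, and the function names are hypothetical.

```python
import math

K, T_MAT, R, SIGMA = 100.0, 1.0, 0.03, 0.2  # illustrative contract and market inputs

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(t: float, S: float) -> float:
    """Standard Black-Scholes price of a European call at calendar time t < T_MAT."""
    tau = T_MAT - t
    d1 = (math.log(S / K) + (R + 0.5 * SIGMA**2) * tau) / (SIGMA * math.sqrt(tau))
    d2 = d1 - SIGMA * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-R * tau) * norm_cdf(d2)

def bs_pde_residual(t: float, S: float, ht: float = 1e-4, hS: float = 0.05) -> float:
    """V_t + r S V_S + 0.5 sigma^2 S^2 V_SS - r V, with finite-difference derivatives."""
    V = bs_call(t, S)
    V_t = (bs_call(t + ht, S) - bs_call(t - ht, S)) / (2.0 * ht)
    V_S = (bs_call(t, S + hS) - bs_call(t, S - hS)) / (2.0 * hS)
    V_SS = (bs_call(t, S + hS) - 2.0 * V + bs_call(t, S - hS)) / hS**2
    return V_t + R * S * V_S + 0.5 * SIGMA**2 * S**2 * V_SS - R * V

for t, S in [(0.25, 90.0), (0.5, 100.0), (0.75, 120.0)]:
    print(f"t = {t:.2f}  S = {S:6.1f}  price = {bs_call(t, S):8.4f}  residual = {bs_pde_residual(t, S):+.2e}")
```

The residual is zero up to finite-difference error. Dropping the $\tfrac{1}{2}\sigma^2 S^2 V_{SS}$ term from `bs_pde_residual` leaves a residual of the same size as theta, which is the gamma-theta balance in numerical form.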

Example 5 — Short numerical check

The Itô correction is not an abstraction — you can see it on your laptop in ten seconds.

# Python: d(W^2) = dt + 2W dW implies E[W_T^2] = T, not 0
import numpy as np

rng = np.random.default_rng(0)
T, n, N = 1.0, 1000, 100_000
dt = T / n
dW = rng.normal(0, np.sqrt(dt), size=(N, n))
W = np.cumsum(dW, axis=1)

# Naive "ordinary chain rule": d(W^2) = 2W dW, so W_T^2 = ∫2W dW has mean 0
naive_prediction = 0.0

# Itô: d(W^2) = dt + 2W dW, so W_T^2 = T + ∫2W dW has mean T
ito_prediction = T

empirical = (W[:, -1] ** 2).mean()
print(f"Ordinary-calculus prediction: {naive_prediction}")
print(f"Itô prediction:               {ito_prediction}")
print(f"Empirical mean:               {empirical:.4f}")
# Ordinary-calculus prediction: 0.0
# Itô prediction:               1.0
# Empirical mean:               1.0017

The empirical mean lands on the Itô prediction, not on zero. The Itô correction is the difference between a pricing model that works and one that doesn't.

Itô's lemma for multiple processes

In quant finance, many models have two or more Brownian drivers — stochastic volatility (Heston), multi-factor rates (two-factor HJM), multi-asset baskets. The multidimensional Itô formula generalises cleanly.

Let $X_t = (X_t^1, \ldots, X_t^n)$ be an $n$ -dimensional Itô process driven by an $m$ -dimensional Brownian motion $W_t = (W_t^1, \ldots, W_t^m)$ :

dX_t^i = a_t^i\,dt + \sum_{k=1}^m b_t^{ik}\,dW_t^k

For a twice-differentiable $V(t, x_1, \ldots, x_n)$ :

dV = V_t\,dt + \sum_i V_{x_i}\,dX_t^i + \tfrac{1}{2}\sum_{i, j}V_{x_i x_j}\,d[X^i, X^j]_t

where the quadratic covariation is:

d[X^i, X^j]_t = \sum_{k=1}^m b_t^{ik}b_t^{jk}\,dt

When the driving Brownians are correlated with $d[W^k, W^\ell]_t = \rho_{k\ell}\,dt$ , the covariation formula generalises to $d[X^i, X^j]_t = \sum_{k, \ell} b_t^{ik}b_t^{j\ell}\rho_{k\ell}\,dt$ . This is how the Heston model, the SABR model, and the Margrabe exchange-option formula are derived.
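The simplest two-dimensional case is $V(x_1, x_2) = x_1 x_2$ with two correlated Brownian motions, where the formula reduces to the product rule $d(W^1 W^2) = W^1\,dW^2 + W^2\,dW^1 + \rho\,dt$ . The sketch below checks both the covariation limit and the discrete product rule on one path. The correlation value, seed, and grid size are illustrative, and the correlated increments are built with a hand-written two-by-two Cholesky step.

```python
import numpy as np

rng = np.random.default_rng(9)  # arbitrary seed
rho, T, n = -0.7, 1.0, 200_000
dt = T / n

# Correlated increments from independent ones (2x2 Cholesky factor written out).
Z1 = rng.normal(0.0, np.sqrt(dt), size=n)
Z2 = rng.normal(0.0, np.sqrt(dt), size=n)
dW1 = Z1
dW2 = rho * Z1 + np.sqrt(1.0 - rho**2) * Z2

W1 = np.concatenate(([0.0], np.cumsum(dW1)))
W2 = np.concatenate(([0.0], np.cumsum(dW2)))

# Realised covariation: discrete stand-in for [W1, W2]_T = rho * T.
covariation = np.sum(dW1 * dW2)

# Discrete product rule, exact by telescoping:
# W1_T * W2_T = sum W1 dW2 + sum W2 dW1 + sum dW1 dW2
product = W1[-1] * W2[-1]
ito_terms = np.sum(W1[:-1] * dW2) + np.sum(W2[:-1] * dW1)

print(f"sum dW1 dW2           = {covariation:+.4f}   (rho * T = {rho * T:+.4f})")
print(f"W1_T * W2_T           = {product:+.6f}")
print(f"Ito terms + covariation = {ito_terms + covariation:+.6f}")
```

With independent drivers ( $\rho = 0$ ) the covariation sum collapses to zero and the ordinary product rule reappears.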

Common confusions and pitfalls

"Itô's lemma is just the chain rule with an extra term." It is and it isn't. The form looks like a chain rule, but the content is different: ordinary calculus treats $(dx)^2$ as negligible, while Itô calculus treats $(dW)^2$ as a first-order contribution. Forgetting this distinction (say, computing $d(\ln S_t) = dS_t/S_t$ without the $-\tfrac{1}{2}\sigma^2\,dt$ ) is one of the most common mistakes in stochastic calculus, and the resulting model's log-return drift will be off by $\tfrac{1}{2}\sigma^2$ per year (about 3.1% per year at 25% volatility, since $\tfrac{1}{2}(0.25)^2 = 0.03125$ ).

" $(dW_t)^2 = dt$ is an algebraic identity." It is not. You are not squaring a random number and getting a deterministic one. The identity is shorthand for the $L^2$ limit $\sum(\Delta W_i)^2 \to T$ under refinement. Inside an Itô calculation, treating $(dW_t)^2$ as $dt$ is a valid manipulation because what you actually compute is an integral against $dt$, not a pointwise value. Outside that context the "identity" is meaningless.

"Itô's lemma applies to any continuous process." No. It requires an Itô process: an adapted integrand structure $dX = a\,dt + b\,dW$ with well-defined Itô integrals. For processes with jumps (Lévy, jump-diffusion), a generalised version with a compensator term is needed. For processes whose quadratic variation is not of the form $\int b^2\,dt$, such as fractional Brownian motion with $H \ne 1/2$ (zero quadratic variation for $H > 1/2$, infinite for $H < 1/2$ ), Itô's lemma in this form fails and specialised calculi such as rough-path theory apply.

"The Itô correction sign is always negative." The correction is $+\tfrac{1}{2}b^2 V_{xx}$, which has the sign of $V_{xx}$. For $V = \ln S$, $V_{SS} = -1/S^2 < 0$, giving a negative correction $-\tfrac{1}{2}\sigma^2$. For $V = S^2$, $V_{SS} = 2 > 0$, giving a positive correction. The sign of the correction is the sign of the convexity of the payoff, which is why convex payoffs (long options) benefit from volatility and concave payoffs (short options) suffer.

"Itô's lemma gives the expectation of $V(t, X_t)$ ." It does not. It gives the SDE satisfied by $V(t, X_t)$. Taking expectations and using that Itô integrals have mean zero gives the ODE $\tfrac{d}{dt}\mathbb{E}[V] = \mathbb{E}[V_t + a V_x + \tfrac{1}{2}b^2 V_{xx}]$, but the right-hand side is generally not a function of $\mathbb{E}[V]$ alone, so solving this ODE still requires knowing enough about the distribution of $X_t$, usually via the Feynman-Kac formula or a direct distributional argument.
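A case where the expectation ODE does close on its own is $V = S^2$ under geometric Brownian motion, because the drift of $S^2$ is proportional to $S^2$ itself. The worked lines below use the GBM coefficients $a = \mu S$ , $b = \sigma S$ from Example 1 and assume the $dW_t$ integral has mean zero:

```latex
\begin{aligned}
d(S_t^2) &= \left(2\mu S_t^2 + \tfrac{1}{2}\sigma^2 S_t^2 \cdot 2\right)dt + 2\sigma S_t^2\,dW_t
          = (2\mu + \sigma^2)\,S_t^2\,dt + 2\sigma S_t^2\,dW_t, \\[4pt]
\tfrac{d}{dt}\,\mathbb{E}[S_t^2] &= (2\mu + \sigma^2)\,\mathbb{E}[S_t^2]
\quad\Longrightarrow\quad
\mathbb{E}[S_t^2] = S_0^2\,e^{(2\mu + \sigma^2)t}.
\end{aligned}
```

For a general $V$ the right-hand side is the expectation of a different function of $X_t$ , and no such shortcut is available.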

"Stratonovich calculus is a competing framework." Stratonovich integration, which uses a midpoint convention, does satisfy the ordinary chain rule. The catch for finance is that self-financing trading gains are naturally left-endpoint, non-anticipative objects, so Itô is the default pricing convention. Knowing the difference matters when reading non-finance stochastic-calculus references.

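The difference between the two conventions shows up directly in the Riemann sums for $\int_0^T W\,dW$ . The sketch below compares the left-endpoint (Itô) sum with the endpoint-average sum, which is a common discrete stand-in for the Stratonovich midpoint rule and is an assumption made here for illustration. Seed and grid size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(13)  # arbitrary seed
T, n = 1.0, 100_000
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))
W_T = W[-1]

# Ito: integrand evaluated at the left endpoint (non-anticipative).
ito_sum = np.sum(W[:-1] * dW)
# Stratonovich stand-in: integrand averaged over both endpoints.
strat_sum = np.sum(0.5 * (W[:-1] + W[1:]) * dW)

print(f"Ito sum          = {ito_sum:+.5f}   (W_T^2 - T) / 2 = {0.5 * (W_T**2 - T):+.5f}")
print(f"Stratonovich sum = {strat_sum:+.5f}   W_T^2 / 2       = {0.5 * W_T**2:+.5f}")
```

The endpoint-average sum reproduces the ordinary-calculus answer $W_T^2/2$ exactly, while the left-endpoint sum differs from it by about $T/2$ , the Itô correction.
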
Where this goes next

Geometric Brownian Motion: The closed-form GBM solution is pure Itô — the $-\tfrac{1}{2}\sigma^2$ is the Itô correction applied to $\ln S$ .
Stochastic Differential Equations: The integral-equation framework in which Itô's lemma is formally defined. Also supplies the existence and uniqueness theorems needed to apply it.
Black-Scholes PDE: The capstone application — Itô's lemma applied to $V(t, S_t)$ plus a self-financing hedge yields the pricing PDE.
Girsanov's Theorem: Uses the exponential martingale from Example 3 to change Brownian drift under an absolutely continuous measure change.
Infinitesimal Generators and Kolmogorov Equations: Packages the Itô drift operator into the PDE machinery behind Feynman-Kac.
Jump-Diffusion Processes: Generalises Itô's lemma to include jumps (Merton's jump-diffusion). The extra term is a compensator integral against the Poisson random measure.

References

Lawler, G. F. (2023). Stochastic Calculus: An Introduction with Applications. Ch. 3 §3.3 (Itô's formula), §3.4 (More versions of Itô's formula), especially Examples 3.3.1-3.3.2 and 3.4.1.
Albin, P., Hamza, K., & Klebaner, F. C. (2025). Problems and Solutions in Stochastic Calculus with Applications. World Scientific. Ch. 4 (Brownian Motion Calculus) — supporting exercise checks.