Chain Rule
Motivation: why this matters in quant finance
The chain rule is the differentiation rule you use whenever one quantity depends on another, which in turn depends on a third. In quantitative finance this nesting is everywhere: an option price $v$ depends on the stock price $S$, which depends on time $t$; a portfolio's P&L depends on the Greeks, which depend on model parameters; a yield curve depends on discount factors, which depend on short rates.
More concretely, the chain rule is the deterministic ancestor of Itô's Lemma. When Black and Scholes needed to find $dv(S,t)$, where $v$ is the option price and $S$ follows a stochastic process, the starting point was a Taylor expansion, whose first-order part is just the multivariable chain rule applied to $v(S(t),t)$. In the deterministic world, the chain rule gives the exact answer. In the stochastic world, the chain rule gives the wrong answer because it drops the second-order term, which Brownian motion forces you to keep because $(dW_t)^2 = dt$. Understanding the deterministic chain rule precisely is therefore a prerequisite for understanding why and where Itô's Lemma corrects it.
Definition and setup
Single-variable chain rule
Let $f$ and $g$ be differentiable functions. If $y = f(g(x))$, then the derivative of $y$ with respect to $x$ is:
$$\frac{dy}{dx} = f'(g(x)) \cdot g'(x)$$
In Leibniz notation, if $y = f(u)$ and $u = g(x)$:
$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$
The idea is simple: to find how $y$ changes with $x$, multiply the rate at which $y$ changes with $u$ by the rate at which $u$ changes with $x$. Rates of change compose by multiplication.
Assumptions: Both $f$ and $g$ must be differentiable at the relevant points, meaning $g$ at $x$ and $f$ at $g(x)$. If either function has a kink, a jump, or a vertical tangent there, the chain rule does not apply at that point. This is precisely the issue with Brownian motion: the sample paths are (almost surely) nowhere differentiable, so the ordinary chain rule cannot be used.
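The single-variable rule is easy to sanity-check numerically. The sketch below is illustrative only: the functions $f(u) = \sin u$ and $g(x) = x^2$, the evaluation point, and the helper `central_diff` are assumptions chosen for the example, not part of the text above.

```python
import math

def central_diff(func, x, h=1e-6):
    """Central finite-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# Illustrative choice (an assumption): outer f(u) = sin(u), inner g(x) = x**2.
def g(x):
    return x * x

def g_prime(x):
    return 2.0 * x

def f(u):
    return math.sin(u)

def f_prime(u):
    return math.cos(u)

def composite(x):
    return f(g(x))

x0 = 1.3
chain_rule_value = f_prime(g(x0)) * g_prime(x0)  # f'(g(x)) * g'(x)
numerical_value = central_diff(composite, x0)    # brute-force slope of f(g(x))
missing_inner = f_prime(g(x0))                   # what you get if g'(x) is forgotten

print("chain rule       :", chain_rule_value)
print("finite difference:", numerical_value)
print("without g'(x)    :", missing_inner)
```

The first two numbers should agree to many digits, while the third, which omits the inner derivative, should not.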
Multivariable chain rule
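If $f = f(x_1, x_2, \ldots, x_n)$ and each $x_i = x_i(t)$ is a differentiable function of $t$, then:
$$\frac{df}{dt} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \cdot \frac{dx_i}{dt}$$
The most important special case in quant finance is $f = f(x,t)$ where $x = x(t)$:
$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial x} \cdot \frac{dx}{dt}$$
Or, in differential notation:
$$df = f_t\,dt + f_x\,dx$$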
This is the formula that Itô's Lemma extends by adding the second-order correction $\tfrac{1}{2} f_{xx}\,(dx)^2$ when $x$ contains a Brownian component.
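As a rough numerical check of the special case $f = f(x,t)$ with $x = x(t)$, the sketch below compares $f_t + f_x \cdot dx/dt$ against a finite difference taken along the path. The function $f(x,t) = x^2 e^{-t}$ and the path $x(t) = 1 + 0.5t$ are assumptions made up for illustration.

```python
import math

# Illustrative choices (assumptions): f(x, t) = x^2 * exp(-t), x(t) = 1 + 0.5 t.
def f(x, t):
    return x * x * math.exp(-t)

def f_x(x, t):
    return 2.0 * x * math.exp(-t)   # partial derivative in x

def f_t(x, t):
    return -x * x * math.exp(-t)    # partial derivative in t

def x_path(t):
    return 1.0 + 0.5 * t

def x_path_prime(t):
    return 0.5

def total_derivative(t):
    """df/dt = f_t + f_x * dx/dt, evaluated on the path x(t)."""
    x = x_path(t)
    return f_t(x, t) + f_x(x, t) * x_path_prime(t)

def along_path(t):
    return f(x_path(t), t)

t0 = 0.8
h = 1e-6
numerical = (along_path(t0 + h) - along_path(t0 - h)) / (2.0 * h)
analytic = total_derivative(t0)
partial_only = f_t(x_path(t0), t0)  # ignores the dependence through x(t)

print("chain rule       :", analytic)
print("finite difference:", numerical)
print("partial f_t only :", partial_only)
```

The gap between the last value and the first two is the $f_x \cdot dx/dt$ term, which is the contribution that flows through the path $x(t)$.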
Key results and properties
Composition of derivatives
The chain rule says that differentiation "distributes through composition." If $h(x) = f(g(x))$, then $h'(x) = f'(g(x)) \cdot g'(x)$. This extends to any finite chain of compositions. For three functions:
$$\frac{d}{dx} f(g(h(x))) = f'(g(h(x))) \cdot g'(h(x)) \cdot h'(x)$$
Each link in the chain contributes one multiplicative factor. In quant finance, this multi-link chain appears when you differentiate through several layers of model transformation — for instance, computing the sensitivity of a portfolio value to a change in an underlying rate, passing through a yield curve model, a discount factor, and a pricing formula.
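A multi-link chain of this kind might be sketched as follows. Everything specific here is hypothetical: the quadratic "curve model" `zero_rate`, the maturity `T`, and the notional are invented for illustration, and only the structure (one multiplicative factor per layer) is the point.

```python
import math

T = 5.0           # maturity in years (illustrative assumption)
NOTIONAL = 100.0  # face value (illustrative assumption)

def zero_rate(r):
    """Hypothetical curve model mapping a short rate to a zero rate."""
    return r + 0.5 * r * r

def zero_rate_prime(r):
    return 1.0 + r

def discount_factor(z):
    return math.exp(-z * T)

def discount_factor_prime(z):
    return -T * math.exp(-z * T)

def bond_value(d):
    return NOTIONAL * d

def bond_value_prime(d):
    return NOTIONAL

def value_from_rate(r):
    """Full composition: rate -> zero rate -> discount factor -> value."""
    return bond_value(discount_factor(zero_rate(r)))

def sensitivity_via_chain_rule(r):
    """One derivative factor per layer, each evaluated at that layer's input."""
    z = zero_rate(r)
    d = discount_factor(z)
    return bond_value_prime(d) * discount_factor_prime(z) * zero_rate_prime(r)

r0 = 0.03
bump = 1e-6
chain_value = sensitivity_via_chain_rule(r0)
bumped_value = (value_from_rate(r0 + bump) - value_from_rate(r0 - bump)) / (2.0 * bump)

print("chain rule      :", chain_value)
print("bump and reprice:", bumped_value)
```

Note that each factor is evaluated at the output of the previous layer, not at the original input `r`.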
Inverse function derivative
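A useful corollary: if $y = f(x)$ is invertible and differentiable with $f'(x) \neq 0$, then the inverse $x = f^{-1}(y)$ has derivative:
$$\frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{f'(x)}$$
This is the chain rule applied to the identity $f(f^{-1}(y)) = y$: differentiating both sides with respect to $y$ gives $f'(x) \cdot \frac{dx}{dy} = 1$.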
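A small numerical illustration, using the map between a price and its logarithm as an assumed example: with $y = f(S) = \ln S$ the inverse is $S = e^{y}$, and the corollary predicts $dS/dy = 1/f'(S) = S$.

```python
import math

# Illustrative choice (an assumption): f maps a price to its log-price.
def log_price(s):
    return math.log(s)       # y = f(x)

def log_price_prime(s):
    return 1.0 / s           # f'(x), nonzero for every s > 0

def price_from_log(y):
    return math.exp(y)       # x = f^{-1}(y)

s0 = 120.0
y0 = log_price(s0)

inverse_slope = 1.0 / log_price_prime(s0)  # corollary: 1 / f'(x)

h = 1e-6
numerical_slope = (price_from_log(y0 + h) - price_from_log(y0 - h)) / (2.0 * h)

print("1 / f'(x)                :", inverse_slope)
print("finite difference of f^-1:", numerical_slope)
```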
Connection to total differentials
In differential form, the chain rule for $f(x,t)$ with $x = x(t)$ reads $df = f_t\,dt + f_x\,dx$. This is a total differential and is the starting point for the Taylor expansion approach used in the derivation of the Black-Scholes formula. The deterministic version terminates here; the stochastic version adds $\tfrac{1}{2} f_{xx}\,(dx)^2$.
Examples and applications
Example 1: Differentiating the exponential of a linear function
In many pricing formulas, you encounter expressions of the form $e^{-rT}$ where $r$ is the risk-free rate and $T$ is time to maturity. Suppose you want the sensitivity of a discount factor to the rate.
Let $D(r) = e^{-rT}$. This is a composition: $D = f(g(r))$ where $g(r) = -rT$ (an inner linear function) and $f(u) = e^{u}$ (the outer exponential).
$$\frac{dD}{dr} = f'(g(r)) \cdot g'(r) = e^{-rT} \cdot (-T) = -T e^{-rT}$$
The derivative is negative (higher rate means lower discount factor) and its magnitude carries a factor of $T$ (longer maturities are more sensitive to rate changes). Normalising by the price gives $-\frac{1}{D}\frac{dD}{dr} = T$, which is the duration of a zero-coupon bond under continuous compounding, and it is the simplest example of interest rate risk. See Discounting for more context.
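The result can be compared with a "bump and reprice" estimate. The sketch below assumes a flat continuously compounded rate of 5% and a handful of maturities chosen purely for illustration.

```python
import math

def discount_factor(r, T):
    return math.exp(-r * T)

def rate_sensitivity(r, T):
    """dD/dr from the chain rule: -T * exp(-r T)."""
    return -T * math.exp(-r * T)

r0 = 0.05     # illustrative rate (assumption)
bump = 1e-6   # size of the rate bump for the finite difference

rows = []
for T in (1.0, 5.0, 10.0, 30.0):  # illustrative maturities (assumption)
    analytic = rate_sensitivity(r0, T)
    bumped = (discount_factor(r0 + bump, T) - discount_factor(r0 - bump, T)) / (2.0 * bump)
    duration = -analytic / discount_factor(r0, T)  # normalised sensitivity
    rows.append((T, analytic, bumped, duration))
    print(f"T={T:5.1f}  dD/dr={analytic:.6f}  bumped={bumped:.6f}  duration={duration:.4f}")
```

The normalised sensitivity in the last column should reproduce the maturity $T$ itself.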
Example 2: Delta of a transformed payoff
Suppose an option has payoff $H(S_T) = (S_T^2 - K)^+$ at maturity (a "power option"). To compute the delta of the payoff with respect to the spot price, you need the chain rule. In the region where $S_T^2 > K$ (i.e., the option is in the money):
$$\frac{\partial H}{\partial S_T} = \frac{d}{dS_T}\left(S_T^2 - K\right) = 2 S_T$$
Here, $H = f(g(S_T))$ with $g(S_T) = S_T^2 - K$ and $f(u) = u$ (identity in the ITM region). The chain rule gives $f'(g) \cdot g'(S_T) = 1 \cdot 2 S_T = 2 S_T$. For a standard call with payoff $(S_T - K)^+$, the analogous calculation gives delta $= 1$ in the ITM region, because the linear payoff has constant slope.
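A quick check of the payoff slope away from the kink, as a sketch: the strike $K = 100$ and the spot values are arbitrary illustrative choices, and `power_payoff_slope` is a hypothetical helper that simply encodes the chain-rule result region by region.

```python
def power_payoff(s, k):
    """Payoff (s^2 - k)^+ of the power option."""
    return max(s * s - k, 0.0)

def power_payoff_slope(s, k):
    """Chain-rule slope away from the kink: 2*s in the money, 0 out of the money."""
    if s * s > k:
        return 2.0 * s
    if s * s < k:
        return 0.0
    raise ValueError("payoff is not differentiable where s*s == k")

k = 100.0   # illustrative strike (assumption); the kink sits at s = 10
h = 1e-6

checks = []
for s in (8.0, 12.0, 15.0):  # illustrative spots on both sides of the kink
    numerical = (power_payoff(s + h, k) - power_payoff(s - h, k)) / (2.0 * h)
    checks.append((s, power_payoff_slope(s, k), numerical))
    print(f"S={s:5.1f}  chain rule={checks[-1][1]:.4f}  finite difference={numerical:.4f}")
```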
Example 3: The multivariable chain rule in option pricing (deterministic case)
Consider an option value $v(S,t)$ where the stock price is a deterministic function of time (no randomness). Then:
$$\frac{dv}{dt} = \frac{\partial v}{\partial t} + \frac{\partial v}{\partial S} \cdot \frac{dS}{dt}$$
In the notation of the Greeks: $dv = \Theta\,dt + \Delta\,dS$, where $\Theta = v_t$ is theta (time decay) and $\Delta = v_S$ is delta (price sensitivity). This is purely the chain rule.
When the stock price becomes stochastic ($dS = \mu S\,dt + \sigma S\,dW_t$), this formula is no longer complete. The correct version is Itô's Lemma:
$$dv = \Theta\,dt + \Delta\,dS + \tfrac{1}{2}\,\Gamma\,\sigma^2 S^2\,dt$$
where $\Gamma = v_{SS}$ is the second derivative (gamma). The extra gamma term is the Itô correction: the piece the ordinary chain rule misses. This correction is exactly what leads to the Black-Scholes PDE.
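The size of the missing term can be seen on a simulated path. The sketch below is an illustration under stated assumptions, not a derivation: it takes $v(S) = \ln S$ (so $\Theta = 0$, $\Delta = 1/S$, $\Gamma = -1/S^2$), simulates one geometric Brownian motion path with exact log-normal steps, and uses parameter values and a random seed chosen arbitrarily.

```python
import math
import random

random.seed(0)           # fixed seed so the run is repeatable (arbitrary choice)

mu, sigma = 0.05, 0.2    # illustrative drift and volatility (assumptions)
T, n = 1.0, 50_000       # horizon and number of time steps (assumptions)
dt = T / n

s0 = 100.0
s = s0
chain_rule_sum = 0.0     # accumulates Delta * dS only
ito_sum = 0.0            # accumulates Delta * dS + 0.5 * Gamma * sigma^2 * S^2 * dt

for _ in range(n):
    z = random.gauss(0.0, 1.0)
    # Exact log-normal step of dS = mu S dt + sigma S dW
    s_next = s * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
    ds = s_next - s

    delta = 1.0 / s          # v_S for v = ln S
    gamma = -1.0 / (s * s)   # v_SS for v = ln S

    chain_rule_sum += delta * ds
    ito_sum += delta * ds + 0.5 * gamma * sigma ** 2 * s * s * dt
    s = s_next

actual_change = math.log(s) - math.log(s0)
chain_rule_error = chain_rule_sum - actual_change
ito_error = ito_sum - actual_change

print("actual change in ln S:", actual_change)
print("chain rule error     :", chain_rule_error)
print("Ito error            :", ito_error)
print("0.5 * sigma^2 * T    :", 0.5 * sigma ** 2 * T)
```

On a fine grid the chain-rule sum should overshoot the true change in $\ln S$ by roughly $\tfrac{1}{2}\sigma^2 T$, while the sum that includes the gamma term should land close to it. The agreement is approximate because the path is discretised and random.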
Common confusions and pitfalls
"I can cancel the $du$'s in $\frac{dy}{du} \cdot \frac{du}{dx}$." Notationally it looks like cancellation, and for single-variable smooth functions it gives the right answer. But $dy/du$ and $du/dx$ are not fractions of infinitesimal numbers (at least not in standard analysis). This "cancellation" is really a theorem, not an algebraic tautology. The distinction matters when you move to stochastic calculus, where the analogous notation $dW_t$ does not behave like a fraction you can cancel; see the pitfalls section in Brownian Motion.
Forgetting the inner derivative. The most common computational error is writing $\frac{d}{dx} f(g(x)) = f'(g(x))$ and forgetting to multiply by $g'(x)$. For example, $\frac{d}{dx} e^{x^2} = e^{x^2} \cdot 2x$, not $e^{x^2}$.
Applying the chain rule when the function is not differentiable. The chain rule requires differentiability. Payoff functions like $(S-K)^+$ have a kink at $S = K$ and are not differentiable there. The derivative does not exist at the kink; you get a left-derivative (0) and a right-derivative (1) that disagree. In practice, this shows up in the delta of a call option: before expiry delta is a smooth function of $S$, but as expiration approaches it steepens toward a step from 0 to 1 at the strike.
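The disagreement between the one-sided slopes at the kink is easy to see numerically. In the sketch below the strike $K = 100$ and the bump size are illustrative assumptions.

```python
def call_payoff(s, k):
    """Payoff (s - k)^+ of a standard call."""
    return max(s - k, 0.0)

k = 100.0   # illustrative strike (assumption)
h = 1e-6    # bump size

# One-sided difference quotients taken exactly at the kink S = K
right_slope = (call_payoff(k + h, k) - call_payoff(k, k)) / h
left_slope = (call_payoff(k, k) - call_payoff(k - h, k)) / h

# A symmetric bump straddles the kink and averages the two sides
central_slope = (call_payoff(k + h, k) - call_payoff(k - h, k)) / (2.0 * h)

print("right slope  :", right_slope)
print("left slope   :", left_slope)
print("central slope:", central_slope)
```

The central difference returns a value between the two one-sided slopes, which is a reminder that a bump-based estimate can report a number even where no derivative exists.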
Where this goes next
The chain rule is the first of the three core differentiation rules. The other two are the product rule and the quotient rule, which handle products and ratios of functions respectively.
In the stochastic setting, the ordinary chain rule becomes Itô's Lemma, the corrected chain rule for functions of Brownian motion. The derivation of the Black-Scholes formula is built directly on this extension. Understanding exactly what the deterministic chain rule says (and what it assumes) is the best way to understand exactly what Itô's Lemma adds and why.