Chain Rule

Motivation: why this matters in quant finance

The chain rule is the differentiation rule you use whenever one quantity depends on another, which in turn depends on a third. In quantitative finance this nesting is everywhere: an option price $v$ depends on the stock price $S$, which depends on time $t$; a portfolio's P&L depends on the Greeks, which depend on model parameters; a yield curve depends on discount factors, which depend on short rates.
More concretely, the chain rule is the deterministic ancestor of Itô's Lemma. When Black and Scholes needed to find $dv(S, t)$, where $v$ is the option price and $S$ follows a stochastic process, the starting point was a Taylor expansion, which is really just the multivariable chain rule applied to $v(S(t), t)$. In the deterministic world, the chain rule gives the exact answer. In the stochastic world, the chain rule gives the wrong answer because it drops the second-order term $(dW_t)^2 = dt$ that Brownian motion forces you to keep. Understanding the deterministic chain rule precisely is therefore a prerequisite for understanding why and where Itô's Lemma corrects it.

Definition and setup

Single-variable chain rule

Let $f$ and $g$ be differentiable functions. If $y = f(g(x))$, then the derivative of $y$ with respect to $x$ is:

$$\frac{dy}{dx} = f'(g(x)) \cdot g'(x)$$

In Leibniz notation, if $y = f(u)$ and $u = g(x)$:

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$

The idea is simple: to find how $y$ changes with $x$, multiply the rate at which $y$ changes with $u$ by the rate at which $u$ changes with $x$. Rates of change compose by multiplication.

Assumptions: $g$ must be differentiable at $x$, and $f$ must be differentiable at $g(x)$. If either function has a kink, a jump, or a vertical tangent at the relevant point, the chain rule does not apply there. This is precisely the issue with Brownian motion: its sample paths are almost surely nowhere differentiable, so the ordinary chain rule cannot be used.
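The single-variable rule can be sanity-checked numerically. The sketch below compares the chain-rule formula with a central finite difference. The functions $f(u) = \sin u$ and $g(x) = x^2$, the evaluation point, and the helper name `central_diff` are illustrative choices, not taken from the text.

```python
import math

def central_diff(func, x, h=1e-6):
    """Symmetric finite-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Illustrative composition: y = f(g(x)) with f(u) = sin(u), g(x) = x**2.
def g(x):
    return x ** 2

def f(u):
    return math.sin(u)

def chain_rule_derivative(x):
    # f'(g(x)) * g'(x) = cos(x**2) * 2x
    return math.cos(g(x)) * 2 * x

x0 = 1.3
numeric = central_diff(lambda x: f(g(x)), x0)
analytic = chain_rule_derivative(x0)
print(numeric, analytic)  # the two values should agree to several decimal places
```

Both functions are smooth here, so the assumptions of the rule hold and the two numbers should match up to finite-difference error.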

Multivariable chain rule

If $f = f(x_1, x_2, \dots, x_n)$ is differentiable and each $x_i = x_i(t)$ is a differentiable function of $t$, then:

$$\frac{df}{dt} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} \cdot \frac{dx_i}{dt}$$

The most important special case in quant finance is $f = f(x, t)$ where $x = x(t)$:

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial x} \cdot \frac{dx}{dt}$$

Or, in differential notation:

$$df = f_t\,dt + f_x\,dx$$

This is the formula that Itô's Lemma extends by adding the second-order correction $\frac{1}{2}f_{xx}(dx)^2$ when $x$ contains a Brownian component.
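As a quick check of the special case $f(x, t)$ with $x = x(t)$, the sketch below evaluates $f_t + f_x \cdot dx/dt$ and compares it with a finite difference of $t \mapsto f(x(t), t)$. The choices $f(x, t) = x^2 e^{-t}$ and $x(t) = \sin t$ are hypothetical and serve only as an illustration.

```python
import math

# Illustrative choices: f(x, t) = x**2 * exp(-t), x(t) = sin(t).
def f(x, t):
    return x ** 2 * math.exp(-t)

def x_path(t):
    return math.sin(t)

def total_derivative(t):
    """df/dt = f_t + f_x * dx/dt, evaluated along the path x(t)."""
    x = x_path(t)
    f_t = -(x ** 2) * math.exp(-t)   # partial derivative in t, holding x fixed
    f_x = 2 * x * math.exp(-t)       # partial derivative in x, holding t fixed
    dx_dt = math.cos(t)
    return f_t + f_x * dx_dt

def central_diff(func, t, h=1e-6):
    return (func(t + h) - func(t - h)) / (2 * h)

t0 = 0.7
analytic = total_derivative(t0)
numeric = central_diff(lambda t: f(x_path(t), t), t0)
print(analytic, numeric)
```

The $f_t$ term captures the explicit time dependence and the $f_x \cdot dx/dt$ term captures the dependence routed through $x(t)$; dropping either one breaks the agreement.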

Key results and properties

Composition of derivatives

The chain rule says that differentiation "distributes through composition." If $h(x) = f(g(x))$, then $h'(x) = f'(g(x)) \cdot g'(x)$. This extends to any finite chain of compositions. For three functions:

$$\frac{d}{dx} f(g(h(x))) = f'(g(h(x))) \cdot g'(h(x)) \cdot h'(x)$$

Each link in the chain contributes one multiplicative factor. In quant finance, this multi-link chain appears when you differentiate through several layers of model transformation: for instance, computing the sensitivity of a portfolio value to a change in an underlying rate, passing through a yield curve model, a discount factor, and a pricing formula.
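A toy version of such a multi-link chain is sketched below. The three links (a made-up "curve model" `zero_rate`, a discount factor, and a price that scales a notional) are hypothetical stand-ins chosen only to show one derivative factor per link; they are not a real curve or pricing model.

```python
import math

T = 5.0           # maturity in years (illustrative)
NOTIONAL = 100.0  # face value (illustrative)

# Link 1: hypothetical curve model mapping an underlying rate to a zero rate.
def zero_rate(r):
    return r + 0.5 * r ** 2

# Link 2: discount factor from the zero rate.
def discount(z):
    return math.exp(-z * T)

# Link 3: price of a zero-coupon payment.
def price(d):
    return NOTIONAL * d

def sensitivity(r):
    """dP/dr as a product of one derivative per link."""
    z = zero_rate(r)
    d_price = NOTIONAL                     # dP/dD
    d_discount = -T * math.exp(-z * T)     # dD/dz
    d_zero_rate = 1.0 + r                  # dz/dr
    return d_price * d_discount * d_zero_rate

r0, h = 0.03, 1e-6
analytic = sensitivity(r0)
numeric = (price(discount(zero_rate(r0 + h))) - price(discount(zero_rate(r0 - h)))) / (2 * h)
print(analytic, numeric)
```

Each inner derivative is evaluated at the output of the previous link, which is the part that is easy to get wrong when the chain is long.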

Inverse function derivative

A useful corollary: if $y = f(x)$ is invertible and differentiable with $f'(x) \neq 0$, then the inverse $x = f^{-1}(y)$ has derivative:

$$\frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{f'(x)}$$

This is the chain rule applied to the identity $f(f^{-1}(y)) = y$.
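A minimal numerical illustration, assuming $f(x) = e^x$ so that $f^{-1}(y) = \ln y$ (an example chosen here, not taken from the text): the finite-difference slope of the inverse at $y = f(x)$ should equal $1 / f'(x)$.

```python
import math

def f(x):
    return math.exp(x)

def f_prime(x):
    return math.exp(x)

def f_inverse(y):
    return math.log(y)

x0 = 0.8
y0 = f(x0)
h = 1e-6

# Slope of the inverse at y0, estimated by a central difference.
numeric = (f_inverse(y0 + h) - f_inverse(y0 - h)) / (2 * h)

# Inverse-function formula: dx/dy = 1 / f'(x), with x = f_inverse(y0).
analytic = 1.0 / f_prime(x0)
print(numeric, analytic)
```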

Connection to total differentials

In differential form, the chain rule for $f(x, t)$ with $x = x(t)$ reads $df = f_t\,dt + f_x\,dx$. This is a total differential and is the starting point for the Taylor expansion approach used in the derivation of the Black-Scholes formula. The deterministic version terminates here; the stochastic version adds $\frac{1}{2}f_{xx}(dx)^2$.

Examples and applications

Example 1: Differentiating the exponential of a linear function

In many pricing formulas, you encounter expressions of the form $e^{-rT}$ where $r$ is the risk-free rate and $T$ is time to maturity. Suppose you want the sensitivity of a discount factor to the rate.

Let $D(r) = e^{-rT}$. This is a composition: $D = f(g(r))$ where $g(r) = -rT$ (an inner linear function) and $f(u) = e^u$ (the outer exponential).

$$\frac{dD}{dr} = f'(g(r)) \cdot g'(r) = e^{-rT} \cdot (-T) = -T e^{-rT}$$

The derivative is negative (higher rate means lower discount factor) and proportional to $T$ (longer maturities are more sensitive to rate changes). Up to sign and normalisation, this quantity is the duration of a zero-coupon bond: $-\frac{1}{D}\frac{dD}{dr} = T$. It is the simplest example of interest rate risk. See Discounting for more context.
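In practice this derivative is used as a first-order approximation to a small rate move. The sketch below compares the chain-rule prediction for a one basis point bump with a full revaluation. The rate, maturity, and bump size are illustrative values, and the function names are made up for this example.

```python
import math

def discount_factor(r, T):
    return math.exp(-r * T)

def discount_sensitivity(r, T):
    # Chain rule: d/dr exp(-r*T) = exp(-r*T) * (-T)
    return -T * math.exp(-r * T)

r, T = 0.03, 10.0   # illustrative rate and maturity
bump = 1e-4         # one basis point

predicted_change = discount_sensitivity(r, T) * bump
actual_change = discount_factor(r + bump, T) - discount_factor(r, T)

# Normalising by -D recovers the maturity, i.e. the zero-coupon duration.
duration = -discount_sensitivity(r, T) / discount_factor(r, T)
print(predicted_change, actual_change, duration)
```

The small gap between the predicted and actual change is the second-order (convexity) effect that a first derivative cannot capture.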

Example 2: Delta of a transformed payoff

Suppose an option has payoff $H(S_T) = (S_T^2 - K)^+$ at maturity (a "power option"). To compute the delta of the payoff with respect to the spot price, you need the chain rule. In the region where $S_T^2 > K$ (i.e., the option is in the money):

$$\frac{\partial H}{\partial S_T} = \frac{d}{dS_T}(S_T^2 - K) = 2S_T$$

Here, $H = f(g(S_T))$ with $g(S_T) = S_T^2 - K$ and $f(u) = u$ (identity in the ITM region). The chain rule gives $f'(g) \cdot g'(S_T) = 1 \cdot 2S_T = 2S_T$. For a standard call with payoff $(S_T - K)^+$, the analogous calculation gives delta $= 1$ in the ITM region: the linear payoff has constant slope.
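The two payoff deltas can be checked away from the kink with a finite difference. This is a minimal sketch; the spot and strike values are illustrative, and returning 0 exactly at the kink is a convention adopted here, since the derivative is not defined there.

```python
def power_payoff(s, k):
    return max(s * s - k, 0.0)

def power_payoff_delta(s, k):
    # Chain rule in the ITM region: 1 * 2s. Zero out of the money.
    # The value at the kink (s*s == k) is a convention; the derivative is undefined there.
    return 2.0 * s if s * s > k else 0.0

def call_payoff(s, k):
    return max(s - k, 0.0)

def call_payoff_delta(s, k):
    return 1.0 if s > k else 0.0

def central_diff(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)

K = 100.0
itm_spot, otm_spot = 12.0, 8.0   # 12**2 > 100 is ITM, 8**2 < 100 is OTM

power_numeric_itm = central_diff(lambda s: power_payoff(s, K), itm_spot)
power_numeric_otm = central_diff(lambda s: power_payoff(s, K), otm_spot)
call_numeric_itm = central_diff(lambda s: call_payoff(s, K), 120.0)
print(power_numeric_itm, power_payoff_delta(itm_spot, K))
```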

Example 3: The multivariable chain rule in option pricing (deterministic case)

Consider an option value $v(S, t)$ where the stock price is a deterministic function of time (no randomness). Then:

$$\frac{dv}{dt} = \frac{\partial v}{\partial t} + \frac{\partial v}{\partial S} \cdot \frac{dS}{dt}$$

In the notation of the Greeks: $dv = \Theta\,dt + \Delta\,dS$, where $\Theta = v_t$ is theta (time decay) and $\Delta = v_S$ is delta (price sensitivity). This is purely the chain rule.

When the stock price becomes stochastic ($dS = \mu S\,dt + \sigma S\,dW_t$), this formula is no longer complete. The correct version is Itô's Lemma:

$$dv = \Theta\,dt + \Delta\,dS + \frac{1}{2}\Gamma\,\sigma^2 S^2\,dt$$

where $\Gamma = v_{SS}$ is the second derivative (gamma). The extra gamma term is the Itô correction: the piece the ordinary chain rule misses. This correction is exactly what leads to the Black-Scholes PDE.
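The size of the missing term can be seen in a small simulation. The sketch below uses a deliberately simpler setting than the option example: $f(W) = W^2$ applied to a discretised Brownian path, where the ordinary chain rule would predict $df = 2W\,dW$. The step count, seed, and horizon are arbitrary choices for illustration.

```python
import math
import random

random.seed(0)      # fixed seed so the run is reproducible
T = 1.0             # time horizon
n = 100_000         # number of time steps
dt = T / n

w = 0.0                 # W_0 = 0
chain_rule_sum = 0.0    # accumulates f_x dW = 2 W dW (left-endpoint evaluation)
quad_var = 0.0          # accumulates (dW)^2

for _ in range(n):
    dw = random.gauss(0.0, math.sqrt(dt))
    chain_rule_sum += 2.0 * w * dw
    quad_var += dw * dw
    w += dw

actual_change = w * w                  # f(W_T) - f(W_0)
gap = actual_change - chain_rule_sum   # what the ordinary chain rule misses
print(gap, quad_var, T)
```

On a discrete path the gap equals the sum of the squared increments exactly, and that sum stays close to $T$ however fine the grid. This is the $\frac{1}{2}f_{xx}(dW)^2 = dt$ term with $f_{xx} = 2$; for a smooth deterministic path the same sum would shrink to zero as the grid is refined.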

Common confusions and pitfalls

"I can cancel the $du$'s in $\frac{dy}{du} \cdot \frac{du}{dx}$." Notationally it looks like cancellation, and for single-variable smooth functions it gives the right answer. But $dy/du$ and $du/dx$ are not fractions of infinitesimal numbers (at least not in standard analysis). This "cancellation" is really a theorem, not an algebraic tautology. The distinction matters when you move to stochastic calculus, where the analogous notation $dW_t$ does not behave like a fraction you can cancel; see the pitfalls section in Brownian Motion.

Forgetting the inner derivative. The most common computational error is writing $\frac{d}{dx}f(g(x)) = f'(g(x))$ and forgetting to multiply by $g'(x)$. For example, $\frac{d}{dx}e^{x^2} = e^{x^2} \cdot 2x$, not $e^{x^2}$.

Applying the chain rule when the function is not differentiable. The chain rule requires differentiability. Payoff functions like $(S - K)^+$ have a kink at $S = K$ and are not differentiable there. The derivative does not exist at the kink; you get a left-derivative of 0 and a right-derivative of 1, which disagree. In practice, this shows up in the delta of a call near expiry: as expiration approaches, delta as a function of spot steepens toward a step at the strike, and at expiry it jumps from 0 to 1 there.
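The kink in the call payoff is easy to see numerically. This sketch computes one-sided difference quotients of $(S - K)^+$ at the strike, along with a naive central difference; the strike and step size are illustrative.

```python
def call_payoff(s, k):
    return max(s - k, 0.0)

K = 100.0
h = 1e-6

# One-sided difference quotients at the strike.
right_slope = (call_payoff(K + h, K) - call_payoff(K, K)) / h
left_slope = (call_payoff(K, K) - call_payoff(K - h, K)) / h

# A central difference averages the two sides and hides the problem.
central_slope = (call_payoff(K + h, K) - call_payoff(K - h, K)) / (2 * h)
print(left_slope, right_slope, central_slope)
```

The left and right slopes disagree, so no derivative exists at the strike. The central difference lands roughly halfway between them, which is a reminder that a bump-and-revalue delta at a kink can return a number that is not the derivative of anything.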

Where this goes next

The chain rule is the first of the three core differentiation rules. The other two are the product rule and the quotient rule, which handle products and ratios of functions respectively.
In the stochastic setting, the ordinary chain rule becomes Itô's Lemma — the corrected chain rule for functions of Brownian motion. The derivation of the Black-Scholes formula is built directly on this extension. Understanding exactly what the deterministic chain rule says (and what it assumes) is the best way to understand exactly what Itô's Lemma adds and why.

References

  • Stewart, J. (2008). Single Variable Calculus: Early Transcendentals (6th ed.). Thomson Brooks/Cole. Ch. 3 Section 3.4 (The Chain Rule) for the single-variable rule and Leibniz interpretation.