There are two fundamentally different derivations of the Black-Scholes formula. The probabilistic route, already covered in the vault's main BS lesson, uses risk-neutral expectation: $V_0 = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}[\text{payoff}]$. This route generalises cleanly to Monte Carlo, path-dependent payoffs, and stochastic rates.
The PDE route — this note — derives Black-Scholes from a hedging argument that turns the option valuation problem into a second-order parabolic PDE:
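$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV = 0.$$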
This is the original (1973) Black-Scholes-Merton derivation. Its advantages:
Generalises to any European-style derivative with the same diffusion model: the PDE is derived once, and different boundary conditions give different products (call, put, cash-or-nothing, asset-or-nothing).
Natural for finite-difference numerical methods: American options, barrier options, and early-exercise features fit into PDE frameworks far more cleanly than into Monte Carlo.
Shows the hedging structure explicitly: the drift μ disappears because the replicating argument eliminates it.
This note walks through the derivation and sets up the PDE for subsequent finite-difference lessons.
The informal idea
Construct a portfolio that is instantaneously riskless by combining the option with a short position in the right number of shares. If the portfolio is riskless, it must earn the risk-free rate (no-arbitrage). Compare the resulting expression with Itô's expansion of the option's dynamics, match coefficients, and out pops the PDE.
The key insight: a "riskless portfolio" means the dW term vanishes. To make that happen, you choose the hedge ratio Δ=∂V/∂S. Everything else — the dt part, time-decay, gamma — falls into the PDE.
Formal derivation
Setup
Assume the underlying follows geometric Brownian motion:
dS=μSdt+σSdW.
Let $V(S,t)$ be the value of a derivative depending on $S$ and time $t$. Assume $V \in C^{2,1}$ (twice differentiable in $S$, once in $t$).
Step 1: Itô expansion of V
By Itô's lemma,
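$$dV = \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial S}\,dS + \frac{1}{2}\frac{\partial^2 V}{\partial S^2}\,(dS)^2.$$
Using $(dS)^2 = \sigma^2 S^2\,dt$:
$$dV = \left[\frac{\partial V}{\partial t} + \mu S\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right]dt + \sigma S\frac{\partial V}{\partial S}\,dW.$$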
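The substitution $(dS)^2 = \sigma^2 S^2\,dt$ can be checked numerically. The following is a minimal sketch, not part of the original derivation: it simulates one finely discretised GBM path with an Euler step and compares the sum of squared increments with the Riemann sum of $\sigma^2 S^2\,dt$. The parameter values, the step count, and the function name `quadratic_variation_check` are illustrative assumptions.

```python
import math
import random


def quadratic_variation_check(s0=100.0, mu=0.08, sigma=0.20, T=1.0,
                              n_steps=200_000, seed=0):
    """Return (sum of dS^2, sum of sigma^2 S^2 dt) along one simulated path."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    s = s0
    realised = 0.0   # accumulates (dS)^2
    predicted = 0.0  # accumulates sigma^2 S^2 dt
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sqrt_dt)
        ds = mu * s * dt + sigma * s * dw  # Euler step of dS = mu S dt + sigma S dW
        realised += ds * ds
        predicted += sigma * sigma * s * s * dt
        s += ds
    return realised, predicted


realised, predicted = quadratic_variation_check()
ratio = realised / predicted
print(f"sum (dS)^2 = {realised:.3f}")
print(f"sum sigma^2 S^2 dt = {predicted:.3f}")
print(f"ratio = {ratio:.4f}")
```

The ratio should approach 1 as the step count grows, and it should do so for any value of `mu`, because the drift contributes only terms of order $dt^{3/2}$ and $dt^2$ to $(dS)^2$.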
Step 2: Construct the hedged portfolio
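Let $\Pi_t := V - \Delta_t S$, where the hedge ratio $\Delta_t$ (a function of $S, t$) will be set to $\partial V/\partial S$. Holding $\Delta$ fixed over the instant $dt$, the dynamics are:
$$d\Pi = dV - \Delta\,dS = \left[\frac{\partial V}{\partial t} + \mu S\left(\frac{\partial V}{\partial S} - \Delta\right) + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right]dt + \sigma S\left(\frac{\partial V}{\partial S} - \Delta\right)dW.$$
Choosing $\Delta = \partial V/\partial S$ kills the stochastic term, and the $\mu$ term with it:
$$d\Pi = \left[\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2}\right]dt.$$
The portfolio is locally riskless.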
Subtle technical point. The hedge ratio $\Delta_t$ changes over time, so by the Itô product rule $d(\Delta S) = \Delta\,dS + S\,d\Delta + d\Delta\,dS$, and $d\Pi$ should in principle include the $S\,d\Delta$ and $d\Delta\,dS$ terms. The derivation above treats $\Delta$ as instantaneously fixed, keeping only $d(\Delta S) = \Delta\,dS$. The shortcut works here because we choose $\Delta$ after taking the differential, not before; the dropped terms are rebalancing trades, which the self-financing condition offsets against the cash position. A rigorous version uses the self-financing condition explicitly.
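The rigorous route can be sketched as follows. The cash account $B$ (with $dB = rB\,dt$), the cash holding $\beta_t$, and the replicating-portfolio value $X_t$ are notation introduced here for illustration only. A self-financing strategy holding $\Delta_t$ shares and $\beta_t$ units of cash satisfies

$$X_t = \Delta_t S_t + \beta_t B_t, \qquad dX_t = \Delta_t\,dS_t + \beta_t\,dB_t = \Delta_t\,dS_t + r\,(X_t - \Delta_t S_t)\,dt.$$

Requiring $X_t = V(S_t, t)$ for all $t$ and comparing with the Itô expansion of $dV$ gives two matching conditions:

$$\text{(}dW\text{ terms)}\quad \sigma S\,\Delta = \sigma S\,\frac{\partial V}{\partial S} \;\Rightarrow\; \Delta = \frac{\partial V}{\partial S},$$

$$\text{(}dt\text{ terms)}\quad \frac{\partial V}{\partial t} + \mu S\frac{\partial V}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} = \mu S\,\Delta + r\,(V - \Delta S).$$

Substituting $\Delta = \partial V/\partial S$ into the second line cancels the $\mu$ terms and yields the same PDE. No $d\Delta$ term ever appears, because the self-financing condition specifies $dX$ directly instead of differentiating the product $\Delta S$.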
Step 3: No-arbitrage requires Π to earn r
A riskless portfolio must earn the risk-free rate (else arbitrage). So
dΠ=rΠdt=r(V−ΔS)dt.
Equating the two expressions for dΠ:
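$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} = r\left(V - S\frac{\partial V}{\partial S}\right).$$
Rearranging:
$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV = 0.$$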
This is the Black-Scholes PDE.
Step 4: Terminal and boundary conditions
The PDE applies for $(S,t) \in (0,\infty) \times [0,T)$. To pin down $V$, supply a terminal condition at $t = T$ and boundary conditions as $S \to 0$ and $S \to \infty$.
Terminal: at t=T, the option value equals the payoff:
European call: V(S,T)=max(S−K,0).
European put: V(S,T)=max(K−S,0).
Asymptotic (as S→0):
Call: V(0,t)=0 (a worthless stock gives a worthless call).
Put: $V(0,t) = K e^{-r(T-t)}$ (deep ITM; the put is certain to pay $K$ at expiry, so it is worth the discounted strike).
Asymptotic (as S→∞):
Call: $V(S,t) \to S - K e^{-r(T-t)}$ (deep ITM; the call is nearly equivalent to a forward).
Put: V(S,t)→0.
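For a numerical solver, the conditions above are usually packaged as small functions of $S$ and $t$. The sketch below is one hypothetical way to do that; the names `call_conditions`, `put_conditions`, and `discounted_strike` are illustrative, not taken from any library. A useful consistency check is put-call parity: at both ends of the domain, call minus put should equal $S - K e^{-r(T-t)}$.

```python
import math


def discounted_strike(K, r, T, t):
    """Present value at time t of the strike K paid at expiry T."""
    return K * math.exp(-r * (T - t))


def call_conditions(K, r, T):
    """Terminal and boundary values for a European call."""
    return {
        "terminal": lambda S: max(S - K, 0.0),                     # V(S, T)
        "lower": lambda t: 0.0,                                    # V(0, t)
        "upper": lambda S, t: S - discounted_strike(K, r, T, t),   # large S
    }


def put_conditions(K, r, T):
    """Terminal and boundary values for a European put."""
    return {
        "terminal": lambda S: max(K - S, 0.0),                     # V(S, T)
        "lower": lambda t: discounted_strike(K, r, T, t),          # V(0, t)
        "upper": lambda S, t: 0.0,                                 # large S
    }


K, r, T = 100.0, 0.05, 1.0
call = call_conditions(K, r, T)
put = put_conditions(K, r, T)

t, s_large = 0.5, 400.0
parity_at_zero = call["lower"](t) - put["lower"](t)
parity_at_large = call["upper"](s_large, t) - put["upper"](s_large, t)
print(parity_at_zero, 0.0 - discounted_strike(K, r, T, t))
print(parity_at_large, s_large - discounted_strike(K, r, T, t))
```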
Step 5: The PDE admits the Black-Scholes formula as its unique solution (for the call)
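Substitute $V(S,t) = S\Phi(d_1) - K e^{-r(T-t)}\Phi(d_2)$ into the PDE and verify. The verification is algebra-intensive but mechanical: use the derivatives $\Delta = \Phi(d_1)$, $\Gamma = \varphi(d_1)/(S\sigma\sqrt{T-t})$, $\Theta = -S\varphi(d_1)\sigma/(2\sqrt{T-t}) - rKe^{-r(T-t)}\Phi(d_2)$, and check that they satisfy $\Theta + \frac{1}{2}\sigma^2 S^2\Gamma + rS\Delta - rV = 0$. This is Exercise 2.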
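Before doing the algebra, the claim can be sanity-checked numerically without using the Greek formulas at all. The sketch below differentiates the closed-form price by central finite differences and evaluates the PDE residual at a few points. It assumes the standard definitions $d_1 = [\ln(S/K) + (r + \tfrac{1}{2}\sigma^2)(T-t)]/(\sigma\sqrt{T-t})$ and $d_2 = d_1 - \sigma\sqrt{T-t}$; the parameter defaults, the step sizes `h` and `k`, and the function names are illustrative choices.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, t, K=100.0, r=0.05, sigma=0.20, T=1.0):
    """Closed-form Black-Scholes call value at spot S and time t < T."""
    tau = T - t
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / vol
    d2 = d1 - vol
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def pde_residual(S, t, r=0.05, sigma=0.20, h=1e-2, k=1e-4):
    """Left-hand side of the PDE, with derivatives by central differences."""
    V = bs_call(S, t)
    V_t = (bs_call(S, t + k) - bs_call(S, t - k)) / (2.0 * k)
    V_S = (bs_call(S + h, t) - bs_call(S - h, t)) / (2.0 * h)
    V_SS = (bs_call(S + h, t) - 2.0 * V + bs_call(S - h, t)) / (h * h)
    return V_t + 0.5 * sigma**2 * S**2 * V_SS + r * S * V_S - r * V


residuals = {(S, t): pde_residual(S, t)
             for S in (80.0, 100.0, 120.0)
             for t in (0.25, 0.5)}
for point, value in residuals.items():
    print(point, f"{value:.2e}")
```

The residuals are not exactly zero because of finite-difference truncation and rounding, but they should be many orders of magnitude smaller than the individual terms of the PDE.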
Key properties
The drift μ does not appear. This is the signature Black-Scholes result: replication eliminates directional-drift dependence. Two assets with the same volatility but different drifts have the same option value — under Black-Scholes assumptions.
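Drift independence can be illustrated with a discrete delta-hedging experiment. The sketch below starts with exactly the Black-Scholes premium, rebalances a self-financing stock-plus-cash portfolio at fixed intervals, and records the terminal replication error (portfolio value minus payoff) on paths generated under two different real-world drifts. All parameter values, the two drifts, and the function names are assumptions chosen for illustration; with finitely many rebalancing dates the error is small but not zero.

```python
import math
import random
import statistics


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_and_delta(S, tau, K, r, sigma):
    """Black-Scholes call price and delta for time to expiry tau > 0."""
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / vol
    d2 = d1 - vol
    price = S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)
    return price, norm_cdf(d1)


def hedging_errors(mu, S0=100.0, K=100.0, r=0.05, sigma=0.20, T=1.0,
                   n_steps=100, n_paths=1000, seed=1):
    """Mean and standard deviation of the terminal replication error."""
    rng = random.Random(seed)
    dt = T / n_steps
    growth = math.exp(r * dt)
    errors = []
    for _ in range(n_paths):
        S = S0
        price, delta = bs_call_and_delta(S, T, K, r, sigma)
        cash = price - delta * S  # start with exactly the BS premium
        for i in range(1, n_steps + 1):
            z = rng.gauss(0.0, 1.0)
            # exact GBM step under the real-world drift mu
            S *= math.exp((mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
            cash *= growth  # cash earns the risk-free rate
            if i < n_steps:
                _, new_delta = bs_call_and_delta(S, T - i * dt, K, r, sigma)
                cash -= (new_delta - delta) * S  # self-financing rebalance
                delta = new_delta
        errors.append(delta * S + cash - max(S - K, 0.0))
    return statistics.mean(errors), statistics.stdev(errors)


results = {mu: hedging_errors(mu) for mu in (-0.05, 0.20)}
for mu, (mean_err, std_err) in results.items():
    print(f"mu = {mu:+.2f}: mean error = {mean_err:+.3f}, std = {std_err:.3f}")
```

In both cases the same initial premium approximately replicates the payoff, even though the paths drift in opposite directions. Increasing `n_steps` should shrink the spread of the error further.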
The PDE is parabolic. Second-order in $S$, first-order in $t$. It is a backward parabolic equation: with terminal data at $t = T$ it is well-posed, because the substitution $\tau = T - t$ turns it into a standard forward problem with initial data, whose solution is unique under mild growth conditions on $V$.
Feynman-Kac connection. The same PDE arises from the Feynman-Kac formula applied to the risk-neutral expectation $V = e^{-r(T-t)}\,\mathbb{E}^{\mathbb{Q}}[\text{payoff} \mid S_t]$. So the PDE and probabilistic derivations are rigorously equivalent: two views of the same problem.
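The equivalence can be illustrated by evaluating the risk-neutral expectation directly and comparing it with the closed-form solution of the PDE. The sketch below assumes lognormal terminal prices under Q, $S_T = S\exp[(r - \tfrac{1}{2}\sigma^2)\tau + \sigma\sqrt{\tau}\,z]$ with $z$ standard normal, and integrates the discounted payoff against the normal density with a plain trapezoid rule. The integration range, the node count, and the function names are illustrative choices.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, tau, K, r, sigma):
    """Closed-form call value, i.e. the PDE solution."""
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / vol
    d2 = d1 - vol
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def risk_neutral_price(S, tau, r, sigma, payoff, n=8000, z_max=8.0):
    """Discounted Q-expectation of payoff(S_T) by the trapezoid rule in z."""
    h = 2.0 * z_max / n
    drift = (r - 0.5 * sigma**2) * tau
    vol = sigma * math.sqrt(tau)
    total = 0.0
    for i in range(n + 1):
        z = -z_max + i * h
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        s_T = S * math.exp(drift + vol * z)
        total += weight * payoff(s_T) * math.exp(-0.5 * z * z)
    return math.exp(-r * tau) * total * h / math.sqrt(2.0 * math.pi)


S, K, r, sigma, tau = 100.0, 100.0, 0.05, 0.20, 0.5
by_expectation = risk_neutral_price(S, tau, r, sigma, lambda s: max(s - K, 0.0))
by_formula = bs_call(S, tau, K, r, sigma)
print(f"expectation: {by_expectation:.6f}")
print(f"closed form: {by_formula:.6f}")
```

Because `payoff` is passed in as a function, the same quadrature prices any European payoff under the same model, which mirrors the remark that one PDE with different terminal data covers different products.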
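Generalisation. A continuous dividend yield $q$ changes the convection term from $rS\,\partial V/\partial S$ to $(r-q)S\,\partial V/\partial S$. Time-varying $r$ and $\sigma$ keep the PDE form but in general require numerical methods (a closed form survives only in special cases, such as purely time-dependent coefficients). American options become a free-boundary problem because early exercise adds an inequality constraint.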
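Written out, the first and last of these generalisations look as follows. The operator symbol $\mathcal{L}$ and the payoff function $g(S)$ are notation introduced here for illustration. With a continuous dividend yield $q$, the PDE becomes

$$\mathcal{L}V := \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} + (r-q)S\frac{\partial V}{\partial S} - rV = 0.$$

For an American option with exercise payoff $g(S)$, the equality is replaced by a linear complementarity problem:

$$V \ge g, \qquad \mathcal{L}V \le 0, \qquad (V - g)\,\mathcal{L}V = 0.$$

In the continuation region $V > g$ and $\mathcal{L}V = 0$; in the exercise region $V = g$ and $\mathcal{L}V < 0$. The boundary between the two regions is not known in advance, which is what makes this a free-boundary problem.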
Transformation to the heat equation
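Let $\tau = T - t$ (time to expiry), $x = \ln S$, and $u(x,\tau) = e^{r\tau}V(e^x, T-\tau)$. Substituting into the Black-Scholes PDE:
$$\frac{\partial u}{\partial \tau} = \frac{1}{2}\sigma^2\frac{\partial^2 u}{\partial x^2} + \left(r - \frac{1}{2}\sigma^2\right)\frac{\partial u}{\partial x}.$$
A further change $u = v\cdot\exp(ax + b\tau)$ with suitable $a, b$ eliminates the drift term, reducing the PDE to the heat equation
$$\frac{\partial v}{\partial \tau} = \frac{1}{2}\sigma^2\frac{\partial^2 v}{\partial x^2}.$$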
This is how Black, Scholes, and Merton originally derived the closed form: reduce to the heat equation, then use the known Gaussian fundamental solution.
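The "suitable" constants can be found by direct substitution. Write $c := r - \tfrac{1}{2}\sigma^2$ for the drift coefficient (shorthand introduced here). With $u = v\,e^{ax + b\tau}$, dividing through by the exponential gives

$$\frac{\partial v}{\partial \tau} + b\,v = \frac{1}{2}\sigma^2\left(\frac{\partial^2 v}{\partial x^2} + 2a\frac{\partial v}{\partial x} + a^2 v\right) + c\left(\frac{\partial v}{\partial x} + a\,v\right).$$

Setting the coefficients of $\partial v/\partial x$ and of $v$ to zero:

$$\sigma^2 a + c = 0 \;\Rightarrow\; a = -\frac{c}{\sigma^2}, \qquad b = \frac{1}{2}\sigma^2 a^2 + c\,a = -\frac{c^2}{2\sigma^2}.$$

With these choices only the $\partial v/\partial\tau$ and $\partial^2 v/\partial x^2$ terms survive, which is the heat equation.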
Worked example — verifying the PDE for a call
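Set $r = 0.05$, $\sigma = 0.20$, $T = 1$, $S = 100$, $K = 100$, $t = 0.5$. Compute $d_1$, $d_2$, $V$, $\Delta$, $\Gamma$ and $\Theta$ at this point, then confirm that $\Theta + \frac{1}{2}\sigma^2 S^2\Gamma + rS\Delta - rV = 0$.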
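The computation for these parameters can be sketched in a few lines of Python. It uses the closed-form Greeks from Step 5 together with the standard definitions $d_1 = [\ln(S/K) + (r + \tfrac{1}{2}\sigma^2)(T-t)]/(\sigma\sqrt{T-t})$ and $d_2 = d_1 - \sigma\sqrt{T-t}$, which are assumed here; the variable and helper names are illustrative.

```python
import math

r, sigma, T, S, K, t = 0.05, 0.20, 1.0, 100.0, 100.0, 0.5
tau = T - t  # time to expiry


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


vol = sigma * math.sqrt(tau)
d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / vol
d2 = d1 - vol
disc_K = K * math.exp(-r * tau)  # discounted strike

V = S * norm_cdf(d1) - disc_K * norm_cdf(d2)
delta = norm_cdf(d1)
gamma = norm_pdf(d1) / (S * vol)
theta = -S * norm_pdf(d1) * sigma / (2.0 * math.sqrt(tau)) - r * disc_K * norm_cdf(d2)

# Each term of the PDE, then their sum.
terms = {
    "theta": theta,
    "0.5*sigma^2*S^2*gamma": 0.5 * sigma**2 * S**2 * gamma,
    "r*S*delta": r * S * delta,
    "-r*V": -r * V,
}
residual = sum(terms.values())

print(f"d1 = {d1:.4f}, d2 = {d2:.4f}, V = {V:.4f}")
for name, value in terms.items():
    print(f"{name:>24} = {value:+.6f}")
print(f"residual = {residual:.2e}")
```

By hand, $d_1 = 0.035/(0.2\sqrt{0.5}) \approx 0.2475$ and $d_2 \approx 0.1061$, giving a call value of about 6.89. The four terms are individually of order one, and their sum should vanish up to floating-point rounding.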
"The drift μ disappears, so drift doesn't matter." It matters for the real-world probability of ITM-ness, for expected returns, for risk. What disappears is its role in the pricing — because the replication argument hedges away direction. A different μ doesn't change the option's price but does change how much you actually expect to make trading it.
Risk-neutral Q in PDE land. The PDE has no $\mu$ because the argument implicitly evaluates expectations under Q, where $\mathbb{E}^{\mathbb{Q}}[dS/S] = r\,dt$. This is the same underlying change of measure, viewed PDE-style.
Only works for complete markets. The derivation assumes every option can be replicated by trading the underlying and cash. In incomplete markets (stochastic vol, jumps), no riskless hedge exists and the argument breaks — the PDE becomes an inequality (super-replication) or requires an extra pricing kernel.
PDE numerics require care at the boundaries. The Black-Scholes PDE is posed on $S \in (0,\infty)$. For finite-difference methods you truncate to $S \in [0, S_{\max}]$ with $S_{\max}$ chosen so the error at the boundary is small. Too small an $S_{\max}$ destroys accuracy.
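The effect of the truncation level can be seen with a small experiment. The sketch below prices a European call with a basic explicit finite-difference scheme, stepping backward from the terminal payoff and imposing $V(0,t) = 0$ and $V(S_{\max},t) = S_{\max} - Ke^{-r(T-t)}$. It is a teaching sketch under assumed parameters, not a production scheme: the grid spacing, the number of time steps (chosen small enough for the explicit scheme to be stable on these grids), and the function names are all illustrative.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S, tau, K, r, sigma):
    vol = sigma * math.sqrt(tau)
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / vol
    d2 = d1 - vol
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


def explicit_fd_call(S0, K, r, sigma, T, s_max, dS=2.0, n_time=2000):
    """Explicit finite-difference call price on the grid S = 0, dS, ..., s_max."""
    m = int(round(s_max / dS))
    dt = T / n_time
    # Coefficients of the explicit update at interior node j (where S = j * dS).
    a = [0.5 * dt * (sigma**2 * j * j - r * j) for j in range(m + 1)]
    b = [1.0 - dt * (sigma**2 * j * j + r) for j in range(m + 1)]
    c = [0.5 * dt * (sigma**2 * j * j + r * j) for j in range(m + 1)]
    v = [max(j * dS - K, 0.0) for j in range(m + 1)]  # terminal payoff
    for n in range(n_time - 1, -1, -1):
        t = n * dt
        new = [0.0] * (m + 1)
        for j in range(1, m):
            new[j] = a[j] * v[j - 1] + b[j] * v[j] + c[j] * v[j + 1]
        new[0] = 0.0                                        # V(0, t) = 0
        new[m] = m * dS - K * math.exp(-r * (T - t))        # deep-ITM asymptote
        v = new
    return v[int(round(S0 / dS))]  # S0 is assumed to lie on the grid


S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.20, 1.0
exact = bs_call(S0, T, K, r, sigma)
err_wide = abs(explicit_fd_call(S0, K, r, sigma, T, s_max=300.0) - exact)
err_narrow = abs(explicit_fd_call(S0, K, r, sigma, T, s_max=110.0) - exact)
print(f"closed form        : {exact:.4f}")
print(f"error, S_max = 300 : {err_wide:.4f}")
print(f"error, S_max = 110 : {err_narrow:.4f}")
```

With the upper boundary only 10% above the spot, the imposed asymptote is a poor approximation of the true option value there, and the error propagates into the interior of the grid. Moving the boundary several standard deviations away makes the boundary error negligible compared with the discretisation error.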
The argument is formal, not rigorous. The step "choose Δ after taking the differential" sweeps subtlety under the rug. A full self-financing derivation (via the stochastic integral) is cleaner but longer.