Brownian motion (also called the Wiener process) is the continuous-time random process at the heart of nearly every pricing, hedging, and risk model in quantitative finance. When Black and Scholes wrote down the assumption that a stock price follows a geometric Brownian motion, they were choosing Brownian motion as the source of randomness — the engine that drives price uncertainty forward in time.
The reason is not that markets literally move according to a Wiener process. They don't. The reason is threefold:
Brownian motion is the scaling limit of a random walk. The discrete tick-by-tick movements of a price, when aggregated over many small time steps, converge (via Donsker's theorem) to Brownian motion. This gives the model a solid statistical foundation.
The mathematics works. Standard Brownian motion is, in law, the unique process that starts at zero, has continuous paths, and has independent, stationary Gaussian increments with mean zero and variance equal to elapsed time. That combination makes it tractable enough to support an entire calculus, Itô calculus, which in turn makes it possible to derive closed-form or semi-closed-form solutions for derivative prices.
It separates drift from noise. In the stochastic differential equation framework, Brownian motion supplies the unpredictable component (dWt), while a deterministic drift (μdt) supplies the trend. This clean separation is exactly what you need to construct hedging arguments, change probability measures, and arrive at risk-neutral pricing.
Lawler describes Brownian motion as random continuous motion. There are two equivalent mental pictures. At each fixed time t, Wt is a random variable. Across all times, t↦Wt(ω) is a random function: one possible continuous path of accumulated shocks.
The process is the continuous limit of a random walk when time steps shrink like $\Delta t$ and space steps shrink like $\sqrt{\Delta t}$. That scaling keeps variance finite and makes the limiting increments Gaussian. The paths become continuous, but not smooth; their roughness is exactly what later forces quadratic variation and Itô's lemma.
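The scaling can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names, the seed, and the parameter values are illustrative assumptions, not part of the source. It builds a walk with time step $\Delta t$ and space step $\sqrt{\Delta t}$ and estimates the variance of its endpoint, which should be close to the elapsed time $t$.

```python
import math
import random

def scaled_walk_endpoint(t, n_steps, rng):
    """Endpoint of a random walk on [0, t] with time step dt = t / n_steps
    and space step sqrt(dt); each step is up or down with probability 1/2."""
    dt = t / n_steps
    step = math.sqrt(dt)
    position = 0.0
    for _ in range(n_steps):
        position += step if rng.random() < 0.5 else -step
    return position

def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    endpoints = [scaled_walk_endpoint(2.0, 200, rng) for _ in range(2000)]
    # Theory: the endpoint variance is n_steps * dt = t = 2.0.
    print("sample variance of endpoint:", sample_variance(endpoints))
```

With a space step of order $\Delta t$ instead, the endpoint variance would shrink to zero as the grid is refined, which is why the square-root scaling is the one that survives in the limit.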
Formal definitions
Before defining Brownian motion, we need the probability infrastructure on which it lives. This section is deliberately concise; the details are in Probability Space.
We work on a filtered probability space (Ω,F,{Ft}t≥0,P), where:
Ω is the sample space — the set of all possible paths the world can take.
F is the σ-algebra of all events we can assign probabilities to.
{Ft}t≥0 is a filtration: a growing family of σ-algebras with Fs⊆Ft for s≤t. Intuitively, Ft represents everything that is known (observable) up to time t. As time passes, information only accumulates — it never disappears.
P is the probability measure.
A stochastic process (Xt)t≥0 is adapted to the filtration if, for every t, the random variable Xt is Ft-measurable. In plain terms: you can determine the value of Xt using only information available at time t, without peeking into the future. Brownian motion is always assumed to be adapted to its natural filtration (the smallest filtration generated by its own history).
We also impose the usual conditions: the filtration is right-continuous (Ft=Ft+) and F0 contains all P-null sets. These are technical but standard; they ensure that stopping times, martingale theorems, and Itô calculus work cleanly.
Standard Brownian motion
A stochastic process (Wt)t≥0 defined on (Ω,F,{Ft}t≥0,P) is a standard Brownian motion (or Wiener process) if the following four axioms hold:
(BM1) Initial condition:
W0=0 almost surely
(BM2) Independent increments:
For any 0≤t0<t1<⋯<tn, the increments Wt1−Wt0,Wt2−Wt1,…,Wtn−Wtn−1
are mutually independent. Knowing where the path has been tells you nothing about the direction of its next move.
(BM3) Gaussian increments with variance equal to elapsed time:
For any 0≤s<t, Wt−Ws∼N(0,t−s)
The increment is normally distributed with mean zero and variance t−s. In particular, Wt∼N(0,t).
(BM4) Continuous paths:
The map t↦Wt(ω) is continuous for almost every ω∈Ω.
These four properties uniquely determine the law of the process (up to modification on null sets). Everything else — the Markov property, the martingale property, quadratic variation, nowhere differentiability — follows as a consequence.
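On a computer the axioms translate into a simple recipe. The sketch below is an illustration under stated assumptions (a uniform time grid, a hypothetical helper called `brownian_path`, standard library only): start at zero and add independent $N(0, \Delta t)$ increments. Path continuity (BM4) can only be approximated, because the path is known at grid points alone.

```python
import math
import random

def brownian_path(T, n_steps, rng):
    """Return (times, values) for a discretised standard Brownian path on [0, T].

    BM1: the path starts at 0.
    BM2 and BM3: each increment is an independent N(0, dt) draw.
    BM4 is only approximated, since values exist at grid points alone.
    """
    dt = T / n_steps
    times = [i * dt for i in range(n_steps + 1)]
    values = [0.0]
    for _ in range(n_steps):
        values.append(values[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return times, values

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed for reproducibility
    times, values = brownian_path(1.0, 1000, rng)
    print("W at time", times[-1], "is", values[-1])
```

Because the increments are exact Gaussian draws, the values on the grid have the correct joint law; only the behaviour between grid points is missing.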
Immediate consequences of the axioms
From the definition alone, several useful facts are immediate.
Moments:
$$\mathbb{E}[W_t] = 0, \quad \operatorname{Var}(W_t) = t, \quad \mathbb{E}[W_t^2] = t$$
Covariance:
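For $s \le t$, write $W_t = W_s + (W_t - W_s)$. Since $W_s$ is $\mathcal{F}_s$-measurable and $W_t - W_s$ is independent of $\mathcal{F}_s$ with mean zero:
$$\operatorname{Cov}(W_s, W_t) = \mathbb{E}[W_s W_t] = \mathbb{E}[W_s^2] + \mathbb{E}[W_s]\,\mathbb{E}[W_t - W_s] = s = \min(s, t)$$
Together with the zero mean, the covariance $\min(s, t)$ completely characterises the finite-dimensional distributions of a Gaussian process, so among Gaussian processes with continuous paths it is an equivalent way to specify Brownian motion.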
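The identity $\operatorname{Cov}(W_s, W_t) = \min(s, t)$ is easy to test by simulation. The sketch below is illustrative only: the helper names, sample size, and seed are assumptions. It draws the pair $(W_s, W_t)$ from two independent Gaussian increments and estimates the covariance.

```python
import math
import random

def sample_pair(s, t, rng):
    """Draw (W_s, W_t) for 0 < s < t from two independent Gaussian increments."""
    w_s = rng.gauss(0.0, math.sqrt(s))
    w_t = w_s + rng.gauss(0.0, math.sqrt(t - s))
    return w_s, w_t

def empirical_covariance(s, t, n_samples, rng):
    """Sample covariance of W_s and W_t over n_samples independent draws."""
    pairs = [sample_pair(s, t, rng) for _ in range(n_samples)]
    mean_s = sum(p[0] for p in pairs) / n_samples
    mean_t = sum(p[1] for p in pairs) / n_samples
    return sum((a - mean_s) * (b - mean_t) for a, b in pairs) / (n_samples - 1)

if __name__ == "__main__":
    rng = random.Random(7)
    # Theory: Cov(W_0.5, W_2.0) = min(0.5, 2.0) = 0.5.
    print("estimated covariance:", empirical_covariance(0.5, 2.0, 20000, rng))
```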
Core properties
Stationary increments
The distribution of Wt−Ws depends only on the length of the interval t−s, not on when the interval starts. This is already implicit in (BM3), but it is worth naming explicitly: increments are stationary. In finance, this means the "noise structure" of the model is the same whether you look at the first hour of trading or the last.
The $\sqrt{t}$ scaling (self-similarity)
Standard Brownian motion has a powerful scaling property. For any constant c>0, define:
$$\tilde{W}_t = \frac{1}{\sqrt{c}}\, W_{ct}$$
Then $(\tilde{W}_t)_{t \ge 0}$ is again a standard Brownian motion. You can verify this by checking the four axioms: the process starts at zero, has independent Gaussian increments with the correct variance, and has continuous paths.
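The only axiom that needs a computation is the variance of the increments. As a short worked check, with $\tilde{W}_t = W_{ct}/\sqrt{c}$ and $0 \le s < t$:

```latex
\begin{aligned}
\operatorname{Var}\bigl(\tilde{W}_t - \tilde{W}_s\bigr)
  &= \operatorname{Var}\!\left(\tfrac{1}{\sqrt{c}}\,(W_{ct} - W_{cs})\right) \\
  &= \tfrac{1}{c}\,(ct - cs) \\
  &= t - s .
\end{aligned}
```

The factor $1/\sqrt{c}$ is exactly what cancels the stretched time interval $ct - cs$; any other power of $c$ would leave the variance dependent on $c$.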
This self-similarity means that Brownian motion looks statistically identical at every time scale: zooming into a small segment of a Brownian path and rescaling it produces a picture that is indistinguishable (in distribution) from the whole path. In finance, this is connected to the assumption that volatility scales as $\sigma\sqrt{\Delta t}$: the standard deviation of the increment $W_{t+\Delta t} - W_t$ is $\sqrt{\Delta t}$, and when you multiply by volatility $\sigma$ and look at log-returns, you get the familiar $\sigma\sqrt{\Delta t}$ scaling.
Markov property
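Brownian motion is a Markov process: given the present value $W_t$, the future $(W_u)_{u \ge t}$ is conditionally independent of the past $(W_s)_{s \le t}$. Formally:
$$\mathbb{P}(W_u \in A \mid \mathcal{F}_t) = \mathbb{P}(W_u \in A \mid W_t) \quad \text{for all } u \ge t$$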
This follows directly from the independent increments property. The entire future evolution is determined (in law) by where the process is now — not by how it got there. In finance, this is related to the efficient market hypothesis: if the market is efficient, then the current price already incorporates all past information, and only the present price matters for forecasting the distribution of future prices.
Martingale property
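$(W_t)_{t \ge 0}$ is a martingale with respect to its natural filtration. For $s \le t$:
$$\mathbb{E}[W_t \mid \mathcal{F}_s] = W_s + \mathbb{E}[W_t - W_s \mid \mathcal{F}_s] = W_s + \mathbb{E}[W_t - W_s] = W_s$$
The first equality splits $W_t = W_s + (W_t - W_s)$ and uses that $W_s$ is $\mathcal{F}_s$-measurable. The second equality uses the fact that $W_t - W_s$ is independent of $\mathcal{F}_s$, and the third uses $\mathbb{E}[W_t - W_s] = 0$.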
This is the mathematical expression of "fairness": given everything you know up to time s, your best forecast of Wt is simply Ws. No drift, no predictable direction. In the risk-neutral pricing framework, it is precisely this martingale property (applied to discounted prices) that encodes the no-arbitrage condition. See Martingale I for the full story.
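A useful related result: the process $W_t^2 - t$ is also a martingale. This can be shown by direct computation, writing $W_t = W_s + (W_t - W_s)$ for $s \le t$:
$$\mathbb{E}[W_t^2 \mid \mathcal{F}_s] = W_s^2 + 2 W_s\,\mathbb{E}[W_t - W_s] + \mathbb{E}[(W_t - W_s)^2] = W_s^2 + (t - s)$$
so that $\mathbb{E}[W_t^2 - t \mid \mathcal{F}_s] = W_s^2 - s$.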
This result is more than a curiosity — it is closely connected to quadratic variation and is used in proving properties of Itô integrals.
Gaussianity
Brownian motion is a Gaussian process: every finite collection (Wt1,Wt2,…,Wtn) is jointly normally distributed. The joint distribution is fully specified by the mean vector (all zeros) and the covariance matrix with entries Cov(Wti,Wtj)=min(ti,tj). This is an extremely strong property. It means that all marginal distributions, all conditional distributions, and all finite-dimensional projections of Brownian motion are Gaussian.
Path properties
The paths of Brownian motion — the actual functions t↦Wt(ω) for a given outcome ω — have surprising and important properties that distinguish stochastic calculus from ordinary calculus.
Continuity
By axiom (BM4), almost every sample path of Brownian motion is a continuous function of time. There are no jumps. This is what makes the continuous hedging argument in the Black-Scholes PDE possible: the stock price (modelled as a function of Wt) moves continuously, so you can continuously adjust your hedge without being caught off guard by a sudden discontinuity.
In reality, of course, prices do jump (earnings announcements, flash crashes). This is one of the known limitations of Brownian-based models and motivates extensions such as jump-diffusion processes.
Nowhere differentiability
Despite being continuous everywhere, Brownian motion is differentiable nowhere (almost surely). Informally, the path is infinitely "jagged" — no matter how far you zoom in, it never smooths out into something with a well-defined slope.
The intuition is rooted in the scaling of increments. Over a small time interval Δt, the typical size of the increment is:
$$|W_{t+\Delta t} - W_t| \sim \sqrt{\Delta t}$$
If you try to form a "derivative" by computing the difference quotient:
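$$\frac{W_{t+\Delta t} - W_t}{\Delta t} \sim \frac{\sqrt{\Delta t}}{\Delta t} = \frac{1}{\sqrt{\Delta t}} \to \infty \quad \text{as } \Delta t \to 0$$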
The ratio blows up. The path wiggles too violently for a derivative to exist. This is not a technicality — it is the fundamental reason why ordinary calculus (the chain rule, the product rule) cannot be applied to functions of Brownian motion, and why Itô's Lemma is necessary.
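The blow-up of the difference quotient can be seen numerically. The sketch below is a heuristic illustration, not a proof of nowhere differentiability; the function name, sample size, and seed are assumptions. It estimates the root-mean-square size of $(W_{t+\Delta t} - W_t)/\Delta t$, which theory says equals $1/\sqrt{\Delta t}$.

```python
import math
import random

def rms_difference_quotient(dt, n_samples, rng):
    """Root-mean-square of (W_{t+dt} - W_t) / dt over independent increments."""
    total = 0.0
    for _ in range(n_samples):
        increment = rng.gauss(0.0, math.sqrt(dt))
        total += (increment / dt) ** 2
    return math.sqrt(total / n_samples)

if __name__ == "__main__":
    rng = random.Random(3)
    for dt in (1e-1, 1e-2, 1e-3, 1e-4):
        estimate = rms_difference_quotient(dt, 20000, rng)
        # Compare the estimate with the theoretical value 1 / sqrt(dt).
        print(dt, estimate, 1.0 / math.sqrt(dt))
```

Each tenfold reduction in $\Delta t$ should multiply the typical slope by about $\sqrt{10}$, so the quotient has no finite limit.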
Infinite total variation
The total variation of a function f on [0,T] is the supremum over all partitions of:
$$\mathrm{TV}(f; [0,T]) = \sup_{\{t_i\}} \sum_i |f(t_{i+1}) - f(t_i)|$$
For smooth functions, total variation is finite and equals the integral of ∣f′(t)∣. For Brownian motion, total variation is infinite almost surely on every interval, no matter how small. Intuitively: the path oscillates so relentlessly that summing up the absolute sizes of its moves produces an infinite total.
This has a direct consequence: you cannot define a pathwise Riemann-Stieltjes integral $\int_0^T f(t)\,dW_t$ in the classical sense. The integrator $W_t$ has too much variation. This is why stochastic integration requires its own theory (Itô integration), built on $L^2$ limits rather than pointwise approximation.
Quadratic variation
While total (first-order) variation is infinite, the quadratic variation of Brownian motion is finite and deterministic. For a partition $0 = t_0 < t_1 < \cdots < t_n = T$ with mesh $\max_i (t_{i+1} - t_i) \to 0$:
$$[W]_T = \lim_{n \to \infty} \sum_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})^2 = T$$
The convergence is in $L^2$ and also in probability. This is often written in differential notation as:
$$[W]_t = t \quad \text{or equivalently} \quad (dW_t)^2 = dt$$
Quadratic variation is the single most important concept bridging Brownian motion to Itô calculus. In ordinary calculus, $(dx)^2$ is negligible because smooth functions have increments of order $\Delta t$, so $(\Delta x)^2 = O(\Delta t^2)$ and the sum over roughly $T/\Delta t$ intervals still vanishes. But for Brownian motion, $\Delta W = O(\sqrt{\Delta t})$, so $(\Delta W)^2 = O(\Delta t)$, which does not vanish when summed. This non-vanishing second-order term is precisely why the Taylor expansion of $f(W_t)$ retains the $\tfrac{1}{2} f''$ term, and that extra term is the hallmark of Itô's Lemma.
Proof sketch (convergence in L2):
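Define $Q_n = \sum_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})^2$. Each squared increment $(W_{t_{i+1}} - W_{t_i})^2$ has mean $\Delta t_i = t_{i+1} - t_i$ and variance $2(\Delta t_i)^2$ (since if $Z \sim N(0, \sigma^2)$, then $\operatorname{Var}(Z^2) = 2\sigma^4$). By independence of increments:
$$\mathbb{E}[Q_n] = \sum_{i=0}^{n-1} \Delta t_i = T, \qquad \operatorname{Var}(Q_n) = \sum_{i=0}^{n-1} 2(\Delta t_i)^2 \le 2T \max_i \Delta t_i \to 0$$
So $Q_n \to T$ in $L^2$, hence in probability. The sum of squared increments converges to the elapsed time.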
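The contrast between infinite total variation and finite quadratic variation shows up clearly in simulation. The sketch below makes several assumptions for illustration: a uniform partition, a hypothetical helper `variations`, and a freshly simulated set of increments for each grid size rather than successive refinements of one path. That is enough to show the two sums behaving differently as the mesh shrinks.

```python
import math
import random

def variations(T, n_steps, rng):
    """Return (sum of |dW|, sum of dW^2) over a uniform partition of [0, T]."""
    dt = T / n_steps
    sd = math.sqrt(dt)
    total_var = 0.0
    quad_var = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        total_var += abs(dw)
        quad_var += dw * dw
    return total_var, quad_var

if __name__ == "__main__":
    rng = random.Random(5)
    for n in (100, 1000, 10000, 100000):
        tv, qv = variations(1.0, n, rng)
        # The first sum keeps growing (roughly like sqrt(n)); the second stays near T = 1.
        print(n, round(tv, 2), round(qv, 4))
```

On a uniform grid the expected value of the first sum is $\sqrt{2nT/\pi}$, which diverges with $n$, while the second has mean $T$ and variance $2T^2/n$.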
Link to Itô calculus: why ordinary calculus fails
Everything above converges on a single conclusion: classical calculus is not equipped to handle Brownian motion.
The core issue is always the same. If $f$ is a smooth function and $X_t$ is a smooth function of time, then the chain rule gives $df(X_t) = f'(X_t)\,dX_t$, and you can drop all higher-order terms in a Taylor expansion. But if $X_t$ involves Brownian motion, the increment $dX_t$ has a component of order $\sqrt{dt}$, and so $(dX_t)^2$ has a component of order $dt$ that survives in the limit.
Concretely, for a general Itô process:
$$dX_t = a(t, X_t)\,dt + b(t, X_t)\,dW_t$$
the correct chain rule for $f(t, X_t)$ is Itô's Lemma:
$$df = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dX_t + \frac{1}{2}\,\frac{\partial^2 f}{\partial x^2}\, b^2(t, X_t)\,dt$$
The extra $\tfrac{1}{2} f_{xx} b^2\,dt$ term, absent in ordinary calculus, is entirely due to the non-zero quadratic variation of $W_t$. This is the "Itô correction" and it has far-reaching consequences: it is what produces the $-\tfrac{1}{2}\sigma^2$ drift correction in the log-return distribution, it is what makes the Black-Scholes PDE differ from a simple transport equation, and it is why hedging requires continuous rebalancing.
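As a small worked case (chosen here for illustration, not taken from the source), take $X_t = W_t$, so $a = 0$ and $b = 1$, and $f(x) = x^2$:

```latex
\begin{aligned}
f(x) &= x^2, \qquad f_t = 0, \quad f_x = 2x, \quad f_{xx} = 2, \\
d\bigl(W_t^2\bigr) &= 2 W_t \, dW_t + \tfrac{1}{2} \cdot 2 \cdot 1^2 \, dt
                    = 2 W_t \, dW_t + dt, \\
W_T^2 &= 2 \int_0^T W_t \, dW_t + T .
\end{aligned}
```

The ordinary chain rule would stop at $2 W_t\,dW_t$. The extra $dt$ is consistent with the earlier observation that $W_t^2 - t$, not $W_t^2$, is a martingale.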
For the full derivation and worked examples, see Itô's Lemma.
Quant-finance applications
From Brownian motion to geometric Brownian motion
The standard model for a stock price assumes:
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$$
where μ is the drift (expected return) and σ is the volatility. This is the geometric Brownian motion (GBM) SDE. The key feature is that both drift and diffusion are proportional to St, which ensures St>0 and means that percentage changes (not absolute changes) are driven by Brownian motion.
Why this particular SDE? The reasoning is:
Multiplicative noise (σStdWt) means a stock at $100 and a stock at $10 experience the same proportional randomness, which matches how returns behave empirically.
The random walk of log-prices in discrete time (the multiplicative random walk) converges in the scaling limit to exactly this SDE.
Applying Itô's Lemma to $\ln S_t$ gives $d(\ln S_t) = (\mu - \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dW_t$. The $-\tfrac{1}{2}\sigma^2$ correction is purely a consequence of Itô calculus. If you naively applied the ordinary chain rule (ignoring the second-order term), you would get $d(\ln S_t) = \mu\,dt + \sigma\,dW_t$ and conclude that log-returns have drift $\mu$. That is wrong; the correct drift is $\mu - \tfrac{1}{2}\sigma^2$. This difference matters enormously: it determines the expected growth rate of wealth, the calibration of risk-neutral drift, and the correct form of the Black-Scholes PDE.
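The drift correction can be observed without assuming it. The sketch below discretises the SDE $dS_t = \mu S_t\,dt + \sigma S_t\,dW_t$ directly with an Euler-Maruyama step (a technique swapped in here for illustration; the function names, parameter values, and seed are assumptions) and then measures the average log-return. The correction is never coded in, yet the average should land near $(\mu - \tfrac{1}{2}\sigma^2)T$ rather than $\mu T$.

```python
import math
import random

def euler_gbm_log_return(mu, sigma, T, n_steps, rng):
    """Simulate dS = mu*S dt + sigma*S dW by Euler-Maruyama from S_0 = 1
    and return the realised log-return ln(S_T / S_0)."""
    dt = T / n_steps
    sd = math.sqrt(dt)
    s = 1.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        s += mu * s * dt + sigma * s * dw
    # For the small dt used here s stays positive in practice; math.log would
    # raise ValueError if a step ever pushed s to zero or below.
    return math.log(s)

def mean_log_return(mu, sigma, T, n_steps, n_paths, rng):
    """Average log-return over n_paths independent simulated paths."""
    total = 0.0
    for _ in range(n_paths):
        total += euler_gbm_log_return(mu, sigma, T, n_steps, rng)
    return total / n_paths

if __name__ == "__main__":
    mu, sigma, T = 0.10, 0.40, 1.0
    estimate = mean_log_return(mu, sigma, T, 50, 2000, random.Random(11))
    print("simulated mean log-return:", estimate)
    print("Ito prediction (mu - sigma^2/2) * T:", (mu - 0.5 * sigma ** 2) * T)
    print("naive prediction mu * T:", mu * T)
```

The Euler step introduces a small discretisation bias of its own, so agreement with the Itô prediction is approximate and improves as the step size shrinks and the number of paths grows.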
Risk-neutral measure intuition
Under the real-world (physical) measure P, the stock has drift μ:
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t^{\mathbb{P}}$$
But for pricing derivatives, we need the discounted price $e^{-rt} S_t$ to be a martingale. Under $\mathbb{P}$, the discounted price has drift $\mu - r$, which is generally nonzero (investors demand a risk premium). So discounted prices are not martingales under $\mathbb{P}$.
The solution is to change the probability measure from P to a risk-neutral measure Q. Under Q, the stock dynamics become:
$$dS_t = r S_t\,dt + \sigma S_t\,dW_t^{\mathbb{Q}}$$
where $W_t^{\mathbb{Q}}$ is a Brownian motion under $\mathbb{Q}$. The drift $\mu$ has been replaced by the risk-free rate $r$, and now $e^{-rt} S_t$ is a $\mathbb{Q}$-martingale. This is the content of the Fundamental Theorem of Asset Pricing: no-arbitrage is equivalent to the existence of such a measure.
The technical tool that makes the measure change rigorous is Girsanov's theorem, which says that the relationship between the two Brownian motions is:
$$W_t^{\mathbb{Q}} = W_t^{\mathbb{P}} + \frac{\mu - r}{\sigma}\,t$$
The quantity $\theta = \frac{\mu - r}{\sigma}$ is called the market price of risk (or Sharpe ratio). Girsanov's theorem guarantees that $W_t^{\mathbb{Q}}$ is indeed a standard Brownian motion under $\mathbb{Q}$, provided certain integrability conditions (Novikov's condition) are satisfied.
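To see how the drift changes, substitute $dW_t^{\mathbb{P}} = dW_t^{\mathbb{Q}} - \theta\,dt$ with $\theta = (\mu - r)/\sigma$ into the real-world dynamics. This is only the algebraic step; the probabilistic content, that $W^{\mathbb{Q}}$ really is a Brownian motion under $\mathbb{Q}$, is what Girsanov's theorem supplies.

```latex
\begin{aligned}
dS_t &= \mu S_t \, dt + \sigma S_t \, dW_t^{\mathbb{P}} \\
     &= \mu S_t \, dt + \sigma S_t \,\bigl(dW_t^{\mathbb{Q}} - \theta \, dt\bigr),
        \qquad \theta = \frac{\mu - r}{\sigma}, \\
     &= (\mu - \sigma\theta)\, S_t \, dt + \sigma S_t \, dW_t^{\mathbb{Q}} \\
     &= r S_t \, dt + \sigma S_t \, dW_t^{\mathbb{Q}} .
\end{aligned}
```

The volatility term is untouched; only the drift moves from $\mu$ to $r$.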
The upshot for pricing is clean: the fair price of a derivative with payoff H at maturity T is:
$$V_0 = e^{-rT}\,\mathbb{E}^{\mathbb{Q}}[H]$$
No need to estimate μ. No need to model risk preferences. Just take the expectation under Q, where Brownian motion does the same job but with a different drift. This is the logic underpinning the Black-Scholes formula and all of its extensions. For more on measure changes, see Girsanov's theorem.
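The pricing recipe can be sketched as a Monte Carlo estimate. Everything specific in the example is an assumption made for illustration: the payoff $H = \max(S_T - K, 0)$ of a European call, the parameter values, and the helper names. It draws $S_T$ under $\mathbb{Q}$ using the standard closed-form GBM solution $S_T = S_0 \exp((r - \tfrac{1}{2}\sigma^2)T + \sigma W_T)$ and compares the discounted average payoff with the standard Black-Scholes call formula. Note that $\mu$ appears nowhere.

```python
import math
import random

def norm_cdf(x):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, k, r, sigma, T):
    """Black-Scholes price of a European call (standard closed form)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s0 * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)

def mc_call(s0, k, r, sigma, T, n_paths, rng):
    """Discounted average call payoff with S_T drawn under Q (drift r, not mu)."""
    total = 0.0
    for _ in range(n_paths):
        w_T = rng.gauss(0.0, math.sqrt(T))
        s_T = s0 * math.exp((r - 0.5 * sigma ** 2) * T + sigma * w_T)
        total += max(s_T - k, 0.0)
    return math.exp(-r * T) * total / n_paths

if __name__ == "__main__":
    args = (100.0, 100.0, 0.05, 0.2, 1.0)  # s0, k, r, sigma, T
    print("Monte Carlo:", mc_call(*args, 50000, random.Random(2024)))
    print("closed form:", bs_call(*args))
```

The Monte Carlo estimate carries sampling error of order $1/\sqrt{n}$ in the number of paths, so the two numbers agree only approximately.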
Common confusions and pitfalls
"Why can't I differentiate Wt?"
This is perhaps the most common confusion for people coming from classical calculus. The short answer: the path is too rough. Over a time interval $\Delta t$, the increment $\Delta W \sim \sqrt{\Delta t}$, so the ratio $\Delta W / \Delta t \sim 1/\sqrt{\Delta t} \to \infty$. The path oscillates so violently at every scale that no tangent line exists, anywhere, ever (with probability one).
This does not mean we "cannot do calculus." It means we must use a different calculus — Itô calculus — in which the basic object is not dWt/dt (which doesn't exist) but dWt itself (an infinitesimal increment). Stochastic differential equations like dS=μSdt+σSdW are not equations about derivatives; they are shorthand for integral equations:
$$S_T = S_0 + \int_0^T \mu S_t\,dt + \int_0^T \sigma S_t\,dW_t$$
where the second integral is an Itô integral, defined as an $L^2$ limit of left-endpoint sums rather than as a pathwise Riemann-Stieltjes integral.
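The choice of evaluation point is not cosmetic. The sketch below uses the simpler integrand $W_t$ in place of $\sigma S_t$ (an assumption made to keep the example short, along with the helper name and seed) and compares left-endpoint and right-endpoint sums for $\int_0^T W_t\,dW_t$. A standard consequence of Itô's Lemma is that the Itô integral equals $\tfrac{1}{2}(W_T^2 - T)$; the left-endpoint sum should approach that value, while the right-endpoint sum differs from it by roughly $T$.

```python
import math
import random

def endpoint_sums(T, n_steps, rng):
    """Return (left_sum, right_sum, W_T) for sums approximating the integral of W dW.

    left_sum evaluates W at the start of each interval (the Ito convention);
    right_sum evaluates W at the end of each interval.
    """
    dt = T / n_steps
    sd = math.sqrt(dt)
    w = 0.0
    left_sum = 0.0
    right_sum = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        left_sum += w * dw
        right_sum += (w + dw) * dw
        w += dw
    return left_sum, right_sum, w

if __name__ == "__main__":
    left, right, w_T = endpoint_sums(1.0, 20000, random.Random(9))
    print("left-endpoint sum:", left)
    print("(W_T^2 - T) / 2:", 0.5 * (w_T ** 2 - 1.0))
    print("right minus left:", right - left)
```

For a smooth integrator the two sums would converge to the same limit. Here their difference is the sum of squared increments, which converges to $T$ instead of zero.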
"What does dWt actually mean?"
dWt is not a well-defined mathematical object on its own. It is a notational shorthand. When we write dSt=μStdt+σStdWt, what we really mean is the integral form above. The "differential notation" is a compact and intuitive way to express the integral equation, but it should always be understood as shorthand.
In particular, dWt does not have a "value" — you cannot evaluate it at a point. It is meaningful only inside an integral. Think of it as analogous to dx in ∫f(x)dx: the symbol dx is not a number, but it makes the integral notation work.
"Why does (dWt)2=dt?"
This is one of the most frequently misunderstood statements in stochastic calculus. It is not an algebraic identity — you are not squaring a number and getting another number. It is a statement about quadratic variation: when you sum up squared increments of Brownian motion over a partition and take the limit, you get the elapsed time.
More precisely, $(dW_t)^2 = dt$ is shorthand for:
$$\sum_{i=0}^{n-1} (W_{t_{i+1}} - W_{t_i})^2 \xrightarrow{L^2} T \quad \text{as } \max_i (t_{i+1} - t_i) \to 0$$
The "multiplication rules" of stochastic calculus, $(dW_t)^2 = dt$, $dt \cdot dW_t = 0$, and $(dt)^2 = 0$, are not algebra. They are limit statements about how different types of infinitesimal quantities behave when summed over many small intervals. The first rule holds because Brownian increments are of order $\sqrt{dt}$; the second holds because $dt \cdot \sqrt{dt} = dt^{3/2} \to 0$ faster than $dt$; the third holds because $dt^2$ vanishes faster still.
These rules are what you mechanically apply when using Itô's Lemma, but it is important to remember that the justification is always convergence in probability (or L2), not algebraic manipulation.
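All three rules can be checked as sums over a partition. The sketch below is illustrative only, with an assumed uniform grid, helper name, and seed. It accumulates $\sum (\Delta W)^2$, $\sum \Delta t\,\Delta W$, and $\sum (\Delta t)^2$ over $[0, T]$; as the grid is refined the first should stay near $T$ while the other two shrink towards zero.

```python
import math
import random

def rule_sums(T, n_steps, rng):
    """Return (sum dW^2, sum dt*dW, sum dt^2) over a uniform partition of [0, T]."""
    dt = T / n_steps
    sd = math.sqrt(dt)
    sum_dw2 = 0.0
    sum_dt_dw = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, sd)
        sum_dw2 += dw * dw
        sum_dt_dw += dt * dw
    sum_dt2 = n_steps * dt * dt  # every term equals dt^2, so the sum is T * dt
    return sum_dw2, sum_dt_dw, sum_dt2

if __name__ == "__main__":
    rng = random.Random(4)
    for n in (100, 10000):
        print(n, rule_sums(1.0, n, rng))
```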
"Brownian motion has drift zero, so it can't model a stock with positive expected return"
This confuses the role of Brownian motion with the role of the SDE. Standard Brownian motion Wt has zero drift, but the stock price model dSt=μStdt+σStdWt can have any drift μ you like. Brownian motion supplies the noise; the drift is added separately as a deterministic term μStdt. And under the risk-neutral measure Q, the drift is r (the risk-free rate), not μ — so even the meaning of "drift" changes with the measure.
"If (dWt)2=dt is deterministic, doesn't that mean we can predict Wt?"
No. The quadratic variation $[W]_t = t$ is deterministic, but quadratic variation only measures the accumulated squared fluctuation: it says how much total "energy" the path has spent by time $t$. It does not tell you the direction of any individual move. Knowing that $\sum (\Delta W)^2 \approx T$ is like knowing the length of a tangled rope without knowing its shape. The signed increments $\Delta W_i$ are still independent normals with mean zero; the information about direction is lost when you square.
Where this goes next
Geometric Brownian Motion: The exponential of Brownian motion with drift. Gives a positive, log-normal process that is the canonical stock-price model in the Black-Scholes framework.
Itô's Lemma: The stochastic chain rule. Every calculation involving a function of Brownian motion — log-returns, option Greeks, the Black-Scholes PDE — routes through this result.
Stochastic Differential Equations: The integral-equation framework $X_t = X_0 + \int_0^t a\,ds + \int_0^t b\,dW_s$ in which SDEs like $dS = \mu S\,dt + \sigma S\,dW$ are properly defined.
Martingales: Brownian motion is the canonical continuous martingale; the martingale representation theorem says that under mild conditions, every martingale on the Brownian filtration is a stochastic integral against W.
Change of Measure (Girsanov's theorem): Under a new measure Q, Wt plus a drift is a new Brownian motion. This is the machine that makes risk-neutral pricing rigorous.
Black-Scholes PDE: The capstone application. Brownian motion supplies the noise, Itô's lemma supplies the calculus, and Girsanov supplies the drift change.
References
Lawler, G. F. (2023). Stochastic Calculus: An Introduction with Applications. Ch. 2 §2.3 (Limits of random walks), §2.4 (Brownian motion), §2.6 (Understanding Brownian motion), §2.7 (Computations for Brownian motion), §2.8 (Quadratic variation).
Albin, P., Hamza, K., & Klebaner, F. C. (2025). Problems and Solutions in Stochastic Calculus with Applications. World Scientific. Ch. 4 (Brownian Motion Calculus) — supporting exercise checks.