Markov Chains

Motivation: why this matters in quant finance

A Markov chain is a stochastic model where the current state contains all information needed to predict the next state. In quant finance, that idea appears everywhere: credit ratings migrate between AAA, AA, A, BBB and default; volatility regimes switch between calm and stressed; limit-order-book states update tick by tick; queue lengths in execution models evolve as orders arrive and depart.

The Markov assumption is powerful because it turns path-dependent uncertainty into state-dependent recursion. Instead of carrying the full history, pricing and risk calculations update a vector of state probabilities:

$$\boldsymbol{\pi}_{n+1}=\boldsymbol{\pi}_n P.$$

That one line is why Markov chains are the finite-state ancestor of infinitesimal generators, Markov diffusions, Feynman-Kac, and regime-switching finance models.

The informal idea

The Markov property says: given the present, the future is independent of the past.

If $X_n$ is today's regime, then knowing yesterday's and last week's regimes does not improve the forecast of tomorrow once $X_n$ is known. The whole modelling burden is pushed into the transition probabilities:

$$p_{ij}=\mathbb{P}(X_{n+1}=j\mid X_n=i).$$

For a finite state space, the probabilities form a transition matrix $P=(p_{ij})$. Each row sums to one, because from state $i$ the chain must move somewhere:

$$\sum_j p_{ij}=1.$$
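As a minimal sketch of how a transition matrix drives a chain, the snippet below checks the row-sum condition and samples a path in which each draw depends only on the current state. The two-regime matrix and the helper names `is_row_stochastic` and `simulate_path` are illustrative assumptions, not part of the text above.

```python
import random

# Hypothetical two-regime matrix (0 = calm, 1 = stressed); illustrative numbers only.
P = [
    [0.95, 0.05],
    [0.20, 0.80],
]

def is_row_stochastic(P, tol=1e-12):
    """True if every entry is non-negative and every row sums to one."""
    return all(
        all(p >= 0.0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in P
    )

def simulate_path(P, start, n_steps, rng):
    """Sample X_0, ..., X_{n_steps}; each draw uses only the current state."""
    path = [start]
    for _ in range(n_steps):
        current = path[-1]
        # Row `current` of P is the conditional distribution of the next state.
        nxt = rng.choices(range(len(P)), weights=P[current])[0]
        path.append(nxt)
    return path

rng = random.Random(0)  # fixed seed so the sketch is reproducible
path = simulate_path(P, start=0, n_steps=10, rng=rng)
```

Note that the sampler never looks further back than `path[-1]`, which is the Markov property expressed in code.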

Formal definition

Let $(X_n)_{n\ge0}$ take values in a finite or countable state space $S$. It is a discrete-time Markov chain if for all states $i_0,\ldots,i_n,j$,

$$\mathbb{P}(X_{n+1}=j\mid X_0=i_0,\ldots,X_n=i_n)=\mathbb{P}(X_{n+1}=j\mid X_n=i_n).$$

If the right-hand side does not depend on $n$, the chain is time-homogeneous and has transition matrix $P$ with entries

$$p_{ij}=\mathbb{P}(X_{n+1}=j\mid X_n=i).$$

The $k$-step transition probabilities are entries of $P^k$:

$$\mathbb{P}(X_{n+k}=j\mid X_n=i)=(P^k)_{ij}.$$
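The k-step formula can be sketched with plain matrix powers. The code below is one possible implementation using repeated squaring on lists of rows; the matrix values and the helper names `mat_mul` and `mat_pow` are assumptions chosen for illustration.

```python
def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def mat_pow(P, k):
    """Return P^k for an integer k >= 0 by repeated squaring."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # P^0 = I
    base = [row[:] for row in P]
    while k > 0:
        if k & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        k >>= 1
    return result

# Hypothetical two-state matrix, used only to exercise the functions.
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]
P3 = mat_pow(P, 3)  # P3[i][j] is the probability of being in j three steps after i
```

Each row of `P3` is again a probability distribution, since a product of row-stochastic matrices is row-stochastic.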

Key properties

State distributions evolve linearly

If $\boldsymbol{\pi}_n$ is a row vector with $\pi_n(j)=\mathbb{P}(X_n=j)$, then

$$\boldsymbol{\pi}_{n+1}=\boldsymbol{\pi}_n P, \qquad \boldsymbol{\pi}_n=\boldsymbol{\pi}_0P^n.$$

This is the finite-state version of a Kolmogorov forward equation.
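The linear update can be written as a short loop over row-vector times matrix products. This is a sketch under assumed inputs: the three-regime matrix and the function names `step` and `propagate` are hypothetical.

```python
def step(pi, P):
    """One update pi_{n+1} = pi_n P, with pi treated as a row vector."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

def propagate(pi0, P, n_steps):
    """Return the list [pi_0, pi_1, ..., pi_{n_steps}]."""
    history = [list(pi0)]
    for _ in range(n_steps):
        history.append(step(history[-1], P))
    return history

# Hypothetical three-regime matrix (calm, normal, stressed); illustrative numbers only.
P = [
    [0.90, 0.09, 0.01],
    [0.10, 0.80, 0.10],
    [0.05, 0.25, 0.70],
]
history = propagate([1.0, 0.0, 0.0], P, n_steps=5)
```

Starting from a point mass on the first state, `history[1]` is simply the first row of `P`, and every later entry remains a probability vector.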

Communicating classes

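State $j$ is reachable from state $i$ if $(P^n)_{ij}>0$ for some $n\ge 0$ (taking $P^0=I$, so every state reaches itself). States communicate if each is reachable from the other. Communication is an equivalence relation, so the communicating classes partition the state space, and their structure determines long-run behaviour.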

In a credit-rating chain, the default state is often absorbing: once reached, the chain stays there. That single modelling choice changes pricing, risk, and expected-loss calculations.
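Reachability depends only on which entries of the matrix are positive, so communicating classes can be found with a graph search. The sketch below counts a state as reachable from itself in zero steps; the three-state rating chain and the function names are hypothetical.

```python
def reachable_from(P, i):
    """States reachable from i in zero or more steps, via depth-first search."""
    seen = {i}
    stack = [i]
    while stack:
        u = stack.pop()
        for v, p in enumerate(P[u]):
            if p > 0.0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def communicating_classes(P):
    """Group states so that each pair in a group is mutually reachable."""
    n = len(P)
    reach = [reachable_from(P, i) for i in range(n)]
    classes, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        cls = sorted(j for j in reach[i] if i in reach[j])
        classes.append(cls)
        assigned.update(cls)
    return classes

def absorbing_states(P):
    """States that the chain never leaves once entered (p_ii = 1)."""
    return [i for i, row in enumerate(P) if row[i] == 1.0]

# Hypothetical rating chain: 0 = investment grade, 1 = speculative, 2 = default.
P = [
    [0.90, 0.09, 0.01],
    [0.15, 0.75, 0.10],
    [0.00, 0.00, 1.00],
]
classes = communicating_classes(P)
absorbing = absorbing_states(P)
```

For this matrix the two non-default ratings form one class and default forms its own absorbing class.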

Stationary distributions

A distribution $\boldsymbol{\pi}$ is stationary if

$$\boldsymbol{\pi}P=\boldsymbol{\pi}.$$

For an irreducible finite chain, a stationary distribution exists and is unique. Under additional aperiodicity, $\boldsymbol{\pi}_n$ converges to it from any starting state.
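For a finite chain that is irreducible and aperiodic, one simple way to approximate the stationary distribution is to iterate the update until it stops changing. The sketch below assumes those conditions hold; the matrix and the name `stationary_by_iteration` are illustrative, and a linear solve of the stationarity equations would be an equally valid route.

```python
def stationary_by_iteration(P, tol=1e-12, max_iter=100_000):
    """Iterate pi <- pi P from the uniform distribution until the change is below tol.

    Assumes a finite, irreducible, aperiodic chain; otherwise the loop may not
    converge, or may converge to a limit that depends on the starting vector.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("no convergence; the chain may be periodic or reducible")

# Hypothetical two-state matrix; illustrative numbers only.
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]
pi = stationary_by_iteration(P)
```

For a two-state chain the answer can be checked by hand: with exit probabilities 0.1 and 0.5, the stationary weights are 0.5/0.6 and 0.1/0.6.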

Absorption probabilities

If a set of states is absorbing, many questions reduce to linear equations. Let $h_i$ be the probability of eventually being absorbed in a chosen target state, starting from state $i$. Then

$$h_i=\sum_j p_{ij}h_j$$

on transient states, with boundary values $h_i=1$ at the target state and $h_i=0$ at every other absorbing state. This is the discrete analogue of solving a boundary-value problem for a diffusion.
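The linear system for absorption probabilities can be solved in several ways. The sketch below uses a simple fixed-point sweep rather than a direct solver, setting the boundary value to one at a chosen target state and zero at the other absorbing states. The four-state loan chain, its numbers, and the function name are hypothetical.

```python
def absorption_probabilities(P, target, absorbing, tol=1e-12, max_iter=100_000):
    """Probability of ending in `target`, for each starting state.

    Repeatedly applies h_i = sum_j p_ij h_j on the transient states while
    holding h fixed at 1 on `target` and 0 on the other absorbing states.
    """
    n = len(P)
    h = [1.0 if i == target else 0.0 for i in range(n)]
    transient = [i for i in range(n) if i not in absorbing]
    for _ in range(max_iter):
        delta = 0.0
        for i in transient:
            new = sum(P[i][j] * h[j] for j in range(n))
            delta = max(delta, abs(new - h[i]))
            h[i] = new  # in-place update, so later states see the fresh value
        if delta < tol:
            return h
    raise RuntimeError("iteration did not converge")

# Hypothetical loan chain: 0 = repaid, 1 = performing, 2 = distressed, 3 = default.
P = [
    [1.0, 0.0, 0.0, 0.0],
    [0.2, 0.6, 0.2, 0.0],
    [0.0, 0.3, 0.4, 0.3],
    [0.0, 0.0, 0.0, 1.0],
]
h = absorption_probabilities(P, target=3, absorbing={0, 3})
```

Solving the two transient equations by hand gives default probabilities of 1/3 from the performing state and 2/3 from the distressed state, which the iteration should reproduce to within the tolerance.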

Continuous-time chains

A continuous-time Markov chain waits an exponentially distributed time in state $i$ and then jumps to a new state $j$ with probability $p_{ij}$. Bertsekas and Tsitsiklis describe this with transition rates

$$q_{ij}=v_i p_{ij},$$

where $v_i$ is the total rate of leaving state $i$. The rate matrix is the finite-state version of a generator.
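A small sketch can make the rate description concrete: build the rate matrix from the exit rates and jump probabilities, then simulate by drawing exponential holding times. It assumes the jump matrix has zero diagonal (no self-jumps), so the diagonal of the rate matrix is minus the exit rate. The rates, the two-state setup, and the function names are hypothetical.

```python
import random

def generator_matrix(v, jump):
    """Rate matrix with q_ij = v_i * p_ij off the diagonal and q_ii = -v_i.

    Assumes jump[i][i] == 0, so each row of the result sums to zero.
    """
    n = len(v)
    return [
        [-v[i] if i == j else v[i] * jump[i][j] for j in range(n)]
        for i in range(n)
    ]

def simulate_ctmc(v, jump, start, horizon, rng):
    """Return [(jump_time, new_state), ...] up to `horizon`, starting with (0.0, start)."""
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        t += rng.expovariate(v[state])  # exponential holding time with rate v_i
        if t >= horizon:
            return path
        state = rng.choices(range(len(v)), weights=jump[state])[0]
        path.append((t, state))

# Hypothetical regimes: calm is left at rate 0.5 per year, stressed at rate 2.0 per year.
v = [0.5, 2.0]
jump = [
    [0.0, 1.0],
    [1.0, 0.0],
]
Q = generator_matrix(v, jump)
rng = random.Random(1)  # fixed seed so the sketch is reproducible
path = simulate_ctmc(v, jump, start=0, horizon=10.0, rng=rng)
```

The mean holding time in state i is 1/v_i, so in this illustration the calm regime lasts two years on average and the stressed regime half a year.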

Worked example: two-state credit regime

Let $X_n\in\{G,B\}$ represent a borrower's annual credit regime: good or bad. Suppose

$$P= \begin{pmatrix} 0.92 & 0.08 \\ 0.35 & 0.65 \end{pmatrix}.$$

If the borrower starts good, $\boldsymbol{\pi}_0=(1,0)$, then after two years

$$\boldsymbol{\pi}_2 =\boldsymbol{\pi}_0P^2 =(0.92,0.08) \begin{pmatrix} 0.92 & 0.08 \\ 0.35 & 0.65 \end{pmatrix} =(0.8744,0.1256).$$

Even though the one-year downgrade probability is only $8\%$, the two-year bad-regime probability is higher because the chain can enter bad in year one and remain there.
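The 12.56% figure can be decomposed into the two paths that end in the bad regime after two years, which makes the closing remark concrete (the subscripts G and B simply index the entries of P above):

$$
\begin{aligned}
\mathbb{P}(X_2=B\mid X_0=G)
&= p_{GG}\,p_{GB} + p_{GB}\,p_{BB} \\
&= 0.92\times 0.08 + 0.08\times 0.65 \\
&= 0.0736 + 0.0520 = 0.1256.
\end{aligned}
$$

The first term is a borrower who stays good in year one and deteriorates in year two; the second is one who deteriorates in year one and remains bad.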

Common confusions and pitfalls

"Markov means independent over time." No. Consecutive states are usually dependent. Markov means the dependence is summarised by the current state.
"The transition matrix tells you the realised path." It tells you probabilities, not a deterministic trajectory. Simulation still requires random draws.
"Stationary means constant sample paths." No. A stationary distribution can coexist with active state switching; only the marginal distribution is unchanged.
"Every Markov chain converges to a stationary distribution." Finite irreducible aperiodic chains do. Periodic, reducible, or absorbing chains require separate analysis.
"Continuous-time chains are just discrete chains with smaller time steps." They are related, but continuous-time chains are specified by exponential holding times and rates. The generator, not a one-step matrix alone, is the natural object.

Where this goes next

References

  • Lawler, G. F. (2023). Stochastic Calculus: An Introduction with Applications. Ch. 1 §1.4 (Martingale convergence theorem; Markov property in Polya's urn), Ch. 6 §6.2 (Poisson process and generators).
  • Bertsekas, D. P., & Tsitsiklis, J. N. (2008). Introduction to Probability (2nd ed.). Athena Scientific. Ch. 7 §7.1 (Discrete-Time Markov Chains), §7.2 (Classification of States), §7.3 (Steady-State Behavior), §7.4 (Absorption Probabilities), §7.5 (Continuous-Time Markov Chains).