Quasi-Monte Carlo (Low-Discrepancy Sequences)

Motivation: why this matters in quant finance

Monte Carlo pricing converges at rate $O(N^{-1/2})$ . To halve the standard error you need four times the samples. For complex products with slow-running pricing functions (American options with Longstaff-Schwartz, path-dependent exotics, portfolio risk with thousands of paths), this is painful — a $10\times$ accuracy improvement costs $100\times$ compute.

Quasi-Monte Carlo (QMC) replaces random samples with deterministic, carefully designed point sets that fill the unit hypercube more uniformly than independent random samples. The convergence rate becomes

O((\log N)^d / N)

— nearly

N^{-1}

in low dimensions, and with proper handling (Sobol sequences plus randomisation for error estimates) the speedup is often

10

100\times

for the same accuracy in practical quant applications.

Every major fixed-income and equity-derivatives desk runs Sobol sequences by default for anything multi-dimensional. The underlying mathematics — discrepancy theory, the Koksma-Hlawka inequality, digital net constructions — is technical, but the operational takeaway is simple: a replace-np.random.random-with-scipy.stats.qmc.Sobol is often a one-line change that materially accelerates pricing.

This note explains what low-discrepancy means, states the Koksma-Hlawka bound, and gives the two most-used constructions (Halton and Sobol) with their practical trade-offs.

The informal idea

Monte Carlo samples $U_1, \ldots, U_N \in [0, 1]^d$ IID uniform. The integral estimator is

\hat{I}_N = \frac{1}{N}\sum_{i=1}^N f(U_i) \approx \int_{[0,1]^d} f(u)\,du.

Error: $\mathbb{E}[(\hat I_N - I)^2]^{1/2} = \sigma_f / \sqrt{N}$ . The CLT is the source of the $N^{-1/2}$ rate.

IID samples are locally clumpy: in a finite sample, some regions of

[0,1]^d

are over-represented and others under-represented. QMC replaces them with a deterministic sequence chosen so that every region's sample count is close to its volume — the sequence has low discrepancy.

The effect: for smooth integrands, QMC errors decay like $(\log N)^d / N$ instead of $1/\sqrt{N}$ . For $d = 10$ and $N = 10^6$ , that's roughly $10^{-2}$ vs $10^{-3}$ — the log factor takes away some of the gain in high dimensions, but QMC still dominates for $d$ up to maybe $30$ or so in practice.

Formal definitions

Discrepancy

Let

\mathcal{P} = \{x_1, \ldots, x_N\} \subset [0, 1]^d

. The star-discrepancy is

D_N^*(\mathcal{P}) := \sup_{B = [0, b_1] \times \cdots \times [0, b_d]}\left|\frac{|\{x_i \in B\}|}{N} - \text{vol}(B)\right|.

In plain English: the worst-case discrepancy between "fraction of points in box $B$ " and "volume of $B$ ," over all boxes anchored at the origin.

A sequence is low-discrepancy if

D_N^* = O((\log N)^d / N)

. For IID uniform samples,

D_N^* \sim \sqrt{\log\log N / N}

(law of the iterated logarithm) — slower decay by a factor of

\sqrt{N}/\text{polylog}(N)

Koksma-Hlawka inequality

For a function $f$ of bounded variation $V(f)$ (in the Hardy-Krause sense),

\left|\hat I_N(\mathcal{P}) - I\right| \le V(f) \cdot D_N^*(\mathcal{P}).

Interpretation. Integration error is bounded by (smoothness of integrand) × (uniformity of points). Fix the integrand; choosing low-discrepancy points instead of random ones multiplies the error bound by the ratio of discrepancies.

Two caveats:

$V(f)$ is often infinite for realistic pricing functions (e.g., $\max(0, S - K)$ has a kink). Bounded variation holds for smooth integrands, and near-smooth ones (with mild kinks) perform well in practice.
Koksma-Hlawka is a worst-case bound. The average behaviour can be substantially better or (rarely) worse.

Halton sequence

The $d$ -dimensional Halton sequence uses the radical-inverse function $\phi_b(n)$ in base $b$ : write $n = \sum_k a_k b^k$ and let $\phi_b(n) = \sum_k a_k b^{-k-1}$ . The first few terms of the base-2 radical inverse are $\phi_2(1) = 0.5, \phi_2(2) = 0.25, \phi_2(3) = 0.75, \ldots$

The Halton sequence is

x_n = (\phi_{p_1}(n), \phi_{p_2}(n), \ldots, \phi_{p_d}(n)), \qquad p_1, p_2, \ldots = 2, 3, 5, 7, 11, \ldots \text{(primes)}.

Strength. Easy to construct. Discrepancy

O((\log N)^d / N)

Weakness. High-dimensional projections are bad. The Halton 20th vs 30th dimensions (using prime bases

71, 113

) show visible linear patterns — pairs of dimensions collapse onto a line. This is the reason Halton is rarely used beyond

d \approx 5

Sobol sequence

The Sobol sequence is a digital $(t, d)$ -sequence in base 2. The construction uses "direction numbers" — a set of

d

binary sequences — chosen so each

d

-dimensional projection of the sequence satisfies a stratification property.

The details are technical (primitive polynomials over $\mathbb{F}_2$ , recursive direction-number construction), but the operational picture is:

Each dimension's values are generated by XORing the direction-numbers of the bits of $n$ .
Good direction numbers (Joe-Kuo 2008 being the gold-standard choice) extend Sobol's practical utility to $d \ge 1000$ .
Sobol sequences are the default in every serious numerical library (SciPy's qmc.Sobol, QuantLib, MATLAB's sobolset).

Key properties

Dimension dependence. In dimension $d$ , the $(\log N)^d$ factor grows rapidly. For $d = 30$ , $(\log N)^{30}$ is already astronomical. But the effective dimension of most financial integrands is small — most of the variance is captured by the first few dimensions, and the remaining dimensions contribute little.
Brownian bridge construction. For path-dependent options, using Brownian bridge to generate paths places the most important Brownian increments in the low-index Sobol dimensions. This often reduces effective dimension dramatically.
Randomisation. Pure QMC gives a deterministic answer with no error estimate. Scrambled Sobol or randomised Halton add a random shift/scramble, preserving the low-discrepancy property but allowing standard-error estimation via independent replicates.
Non-uniform sampling. QMC generates points in $[0, 1]^d$ . Convert to standard normal via $\Phi^{-1}(u_i)$ , which preserves low discrepancy (modulo the boundary behaviour of $\Phi^{-1}$ ).

Worked example — pricing a European call with QMC

Black-Scholes European call pricing under the risk-neutral measure reduces to a one-dimensional integral. For a geometric-average Asian option with $d$ monitoring dates, the integral is $d$ -dimensional.

Pseudocode:

import numpy as np
from scipy.stats import qmc, norm

def sobol_mc_call(S0, K, r, sigma, T, n_steps, n_paths, seed=0):
    dt = T / n_steps
    engine = qmc.Sobol(d=n_steps, scramble=True, seed=seed)
    u = engine.random(n=n_paths)
    z = norm.ppf(u)
    log_paths = np.cumsum(
        (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z,
        axis=1,
    )
    S = S0 * np.exp(log_paths)
    payoff = np.maximum(S.mean(axis=1) - K, 0.0)
    return np.exp(-r * T) * payoff.mean()

Replace qmc.Sobol with np.random.default_rng for pure MC. For

n_{\text{steps}} = 32

n_{\text{paths}} = 4096

, pricing a geometric Asian on

S_0 = 100, K = 100, \sigma = 0.2, r = 0.05, T = 1

Pure MC standard error (estimated via 20 independent runs): approximately $0.0125$ on a price of $~7.02$ .
Scrambled Sobol standard error: approximately $0.0018$ on the same price.

About a $7\times$ reduction in standard error at the same sample size — equivalent to $\sim 50\times$ fewer samples to reach the same accuracy.

Common confusions and pitfalls

QMC errors are not Gaussian. A single QMC run gives a deterministic number. To get a standard-error-like quantity, use randomised QMC (scrambled Sobol) with multiple independent replicates. Taking the $0.95$ -CI of pure QMC "replicates" that are just different starting offsets is a common mistake.
Skipping the initial Sobol points. Classical advice was to skip the first $2^k$ Sobol points for some $k$ . With modern direction numbers (Joe-Kuo), this is unnecessary and may even slightly hurt. Current practice: don't skip.
Effective dimension matters. A naive $d = 250$ Sobol sequence for a 250-step Asian may barely beat Monte Carlo. With Brownian bridge sampling, effective dimension can drop to $\sim 10$ and the speedup becomes dramatic.
Box-Muller transforms degrade Sobol. For Gaussian sampling, always use inverse-CDF ( $\Phi^{-1}$ ), not Box-Muller. Box-Muller pairs two independent uniforms in a non-monotone way that destroys the Sobol structure.
Koksma-Hlawka requires bounded variation. Discontinuous payoffs (digital options, barrier options) technically violate the assumption. In practice QMC usually still helps, but gains are smaller and smooth replications (e.g., a narrow sigmoid for a digital) recover them.

Where this goes next

Monte Carlo Pricing — the baseline technique QMC replaces.
Monte Carlo Variance Reduction: Antithetic Variates — an orthogonal way to reduce variance; combines with QMC.
Monte Carlo Variance Reduction: Importance Sampling — works well alongside Sobol.