Solution: When the CLT Fails — Cauchy Samples

Exercise: When the CLT Fails: Cauchy Samples

Part 1

import numpy as np

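rng = np.random.default_rng(0)
N = 100_000  # number of independent sample means simulated for each n
for n in [1, 10, 100, 1000]:
    # N rows of n i.i.d. standard Cauchy draws (n=1000 needs about 800 MB of float64)
    X = rng.standard_cauchy(size=(N, n))
    mean_n = X.mean(axis=1)  # one sample mean per row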
    p50, p95, p99 = np.percentile(mean_n, [50, 95, 99])
    print(f"n={n:5d}:  median={p50:+.3f},  95%={p95:+.2f},  99%={p99:+.2f}")
# n=    1:  median=-0.003,  95%=+6.33,  99%=+31.85
# n=   10:  median=+0.005,  95%=+6.34,  99%=+32.00
# n=  100:  median=+0.002,  95%=+6.31,  99%=+31.78
# n= 1000:  median=-0.001,  95%=+6.32,  99%=+31.72

The tail quantiles do not shrink with $n$: the distribution of $\bar X_n$ is essentially the same at $n = 1$ and $n = 1000$.
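As a reference point for the simulated percentiles, the standard Cauchy has the closed-form quantile function $Q(p) = \tan(\pi(p - 1/2))$. The short sketch below evaluates it at the same three levels; the helper name `cauchy_quantile` is introduced here for illustration only.

```python
import numpy as np

def cauchy_quantile(p):
    """Exact quantile function of the standard Cauchy: Q(p) = tan(pi * (p - 1/2))."""
    return np.tan(np.pi * (np.asarray(p) - 0.5))

q50, q95, q99 = cauchy_quantile([0.50, 0.95, 0.99])
print(f"median={q50:+.3f},  95%={q95:+.2f},  99%={q99:+.2f}")
# median=+0.000,  95%=+6.31,  99%=+31.82
```

These are the quantiles of a single draw, so they are the values a sample mean of any size should reproduce if averaging has no effect.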

Part 2

The empirical 95th and 99th percentiles match the standard Cauchy's quantiles, $\approx 6.31$ and $\approx 31.82$, at every $n$. Overlaying the density:

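import matplotlib.pyplot as plt

# Reuses rng and N from Part 1.
bins, lo, hi = 200, -10, 10
width = (hi - lo) / bins
x = np.linspace(lo, hi, 1000)
for n in [1, 100, 1000]:
    fig, ax = plt.subplots()
    X = rng.standard_cauchy(size=(N, n)).mean(axis=1)
    # Weight each draw by 1/(N*width) so bar heights estimate the density using all N draws;
    # density=True would renormalise over [-10, 10] only and inflate the bars by about 7%.
    ax.hist(X, bins=bins, range=(lo, hi), weights=np.full(N, 1 / (N * width)), alpha=0.7, label=f"sample mean, n={n}")
    ax.plot(x, 1/(np.pi*(1 + x**2)), 'r-', label="standard Cauchy density")
    ax.legend()
plt.show()
# all three histograms coincide with the Cauchy density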

The histograms for every $n$ coincide with the standard Cauchy density. Averaging does nothing.
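A visual match can be backed by a number. The sketch below computes the Kolmogorov-Smirnov distance between the empirical CDF of the sample means and the standard Cauchy CDF $F(x) = 1/2 + \arctan(x)/\pi$. The function name `ks_distance_to_cauchy`, the seed, and the reduced replication count are choices made for this illustration, not part of the exercise.

```python
import numpy as np

def ks_distance_to_cauchy(sample):
    """Largest gap between the empirical CDF of `sample` and the standard Cauchy CDF."""
    s = np.sort(sample)
    m = s.size
    cdf = 0.5 + np.arctan(s) / np.pi          # standard Cauchy CDF at the sorted points
    upper = np.arange(1, m + 1) / m - cdf     # ECDF just after each point minus CDF
    lower = cdf - np.arange(0, m) / m         # CDF minus ECDF just before each point
    return max(upper.max(), lower.max())

rng_ks = np.random.default_rng(1)             # separate generator; the seed is arbitrary
ks_by_n = {}
for n in [1, 10, 100]:
    means = rng_ks.standard_cauchy(size=(20_000, n)).mean(axis=1)
    ks_by_n[n] = ks_distance_to_cauchy(means)
    print(f"n={n:4d}:  KS distance = {ks_by_n[n]:.4f}")
```

If averaging changed the distribution, the distance would grow with $n$; under stability it should stay at the level of sampling noise, roughly $1/\sqrt{20000}$, for every $n$.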

Part 3

LLN fails because the true mean is undefined: there is no number for $\bar X_n$ to converge to. CLT fails because the variance is infinite: the $\sigma\sqrt n$ scaling is meaningless when $\sigma = \infty$. Both theorems require finite moments (a finite mean for the LLN, a finite variance for the classical CLT); the Cauchy has neither.

Formally: $\bar X_n \overset{d}{=} X_1$ for all $n$ (stability under averaging). The distribution is fixed: it approaches neither a degenerate point mass (as the LLN would require) nor, after $\sqrt n$-rescaling, a Gaussian (as the CLT would require).
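The stability claim follows in one line from the characteristic function of the standard Cauchy, $\varphi(t) = e^{-|t|}$ (a standard fact, used here without proof). For i.i.d. $X_1, \dots, X_n$:

$$
\varphi_{\bar X_n}(t) = \mathbb{E}\, e^{i t \bar X_n} = \prod_{k=1}^{n} \mathbb{E}\, e^{i (t/n) X_k} = \left(e^{-|t|/n}\right)^{n} = e^{-|t|} = \varphi_{X_1}(t).
$$

Characteristic functions determine distributions, so $\bar X_n$ has the same law as $X_1$. The kink of $e^{-|t|}$ at $t = 0$ is the Fourier-side sign that no mean exists: $\varphi$ is not differentiable there.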

Part 4

If intraday returns were literally Cauchy, the sample mean would never converge: more data would leave the noise exactly as large as before instead of reducing it. The standard "compute an average to see the drift" workflow would be statistically hopeless. Every "estimate of expected return" would be as uncertain as a single observation.
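To make "as uncertain as a single observation" concrete, the sketch below compares the interquartile range (IQR) of the sample mean for Cauchy and Gaussian draws. The IQR is used because the Cauchy has no standard deviation; the helper name, seed, and replication count are illustrative assumptions.

```python
import numpy as np

def iqr_of_sample_mean(draw, n, reps=20_000):
    """IQR of the sample mean of n draws, estimated from `reps` independent replications."""
    means = draw((reps, n)).mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    return q75 - q25

rng_iqr = np.random.default_rng(2)            # arbitrary seed
iqr_by_case = {}
for n in [1, 100]:
    iqr_by_case["cauchy", n] = iqr_of_sample_mean(rng_iqr.standard_cauchy, n)
    iqr_by_case["normal", n] = iqr_of_sample_mean(rng_iqr.standard_normal, n)
    print(f"n={n:4d}:  Cauchy IQR = {iqr_by_case['cauchy', n]:.3f},  "
          f"normal IQR = {iqr_by_case['normal', n]:.3f}")
```

The Gaussian IQR should shrink by a factor of about $\sqrt{100} = 10$, from roughly 1.35 to roughly 0.135, while the Cauchy IQR should stay near its single-draw value of 2.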

Why it matters despite real returns not being Cauchy. Real intraday returns exhibit finite variance but heavy tails (kurtosis > 3). The CLT still applies, but the Berry-Esseen ratio $\rho/\sigma^3$, with $\rho = \mathbb{E}|X - \mu|^3$, is large, so the Gaussian approximation converges slowly. Typical intraday-return samples behave partway between Gaussian (well-behaved) and Cauchy (pathological): the sample mean still converges at the $1/\sqrt n$ rate, but its distribution approaches the Gaussian much more slowly than that rate would naively suggest, so normal-theory confidence intervals can be unreliable at realistic sample sizes. Quants handle this by using bootstrapped confidence intervals, robust estimators (median instead of mean), or block-bootstrap methods that respect dependence.
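A minimal sketch of the bootstrap remedy, assuming i.i.d. observations (so it is not a block bootstrap and ignores dependence). The "returns" are synthetic Student-$t$ draws with 5 degrees of freedom, chosen only because they have finite variance and excess kurtosis; the function `bootstrap_ci`, the scale `1e-3`, and the seed are illustrative assumptions.

```python
import numpy as np

def bootstrap_ci(data, stat, n_boot=2_000, level=0.95, rng=None):
    """Percentile-bootstrap confidence interval for stat(data), assuming i.i.d. observations."""
    rng = np.random.default_rng() if rng is None else rng
    # Each row of idx is one resample of the data, drawn with replacement.
    idx = rng.integers(0, data.size, size=(n_boot, data.size))
    boot_stats = stat(data[idx], axis=1)
    alpha = 1.0 - level
    lo, hi = np.percentile(boot_stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

rng_bs = np.random.default_rng(3)                      # arbitrary seed
returns = 1e-3 * rng_bs.standard_t(df=5, size=500)     # synthetic heavy-tailed data, not real returns
mean_ci = bootstrap_ci(returns, np.mean, rng=rng_bs)
median_ci = bootstrap_ci(returns, np.median, rng=rng_bs)
print(f"mean   = {returns.mean():+.5f},  95% CI = ({mean_ci[0]:+.5f}, {mean_ci[1]:+.5f})")
print(f"median = {np.median(returns):+.5f},  95% CI = ({median_ci[0]:+.5f}, {median_ci[1]:+.5f})")
```

The interval endpoints come from the resampled statistics themselves, so no Gaussian quantile enters the calculation. Passing `np.median` as `stat` gives the robust-estimator variant with the same code.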

Takeaways

Finite variance is not optional — the CLT proof's Taylor expansion of the characteristic function at $0$ requires a finite second derivative, i.e. a finite second moment.
The Cauchy is the canonical counterexample. Stable under averaging, no mean, no variance. The hypotheses of any LLN or CLT must exclude it.
Heavy tails degrade the CLT smoothly. Real-return distributions have finite variance but large kurtosis, which means convergence is slow, not absent. Use bootstrap rather than $\mathcal{N}(0,1)$ quantiles for robust inference.
Stable distributions (of which the Cauchy is one) replace the Gaussian as the limiting distribution when the $X_i$ have infinite variance. The theory of $\alpha$-stable laws generalises the CLT to the heavy-tailed regime.
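The last takeaway can be illustrated numerically. The sketch below uses symmetric Pareto-tailed draws with tail index $\alpha = 1.5$ (finite mean, infinite variance) and compares two normalisations of the sum $S_n$: the classical $S_n/\sqrt n$ and the stable-law scaling $S_n/n^{1/\alpha}$. The distribution, the value of $\alpha$, the helper names, and the seed are all assumptions made for this example.

```python
import numpy as np

ALPHA = 1.5                                   # assumed tail index: finite mean, infinite variance

def symmetric_pareto(rng, size, alpha=ALPHA):
    """Symmetric heavy-tailed draws with P(|X| > x) = x**(-alpha) for x >= 1."""
    magnitude = (1.0 - rng.random(size)) ** (-1.0 / alpha)   # inverse-transform sampling
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * magnitude

def iqr(values):
    """Interquartile range, a spread measure that exists without a variance."""
    q25, q75 = np.percentile(values, [25, 75])
    return q75 - q25

rng_st = np.random.default_rng(4)             # arbitrary seed
scaled_iqr = {}
for n in [100, 1000]:
    sums = symmetric_pareto(rng_st, (5_000, n)).sum(axis=1)
    scaled_iqr["sqrt", n] = iqr(sums / np.sqrt(n))            # classical CLT scaling
    scaled_iqr["stable", n] = iqr(sums / n ** (1.0 / ALPHA))  # generalised-CLT scaling
    print(f"n={n:5d}:  IQR of S_n/sqrt(n) = {scaled_iqr['sqrt', n]:.2f},  "
          f"IQR of S_n/n^(1/alpha) = {scaled_iqr['stable', n]:.2f}")
```

Under the $\sqrt n$ scaling the spread should keep growing with $n$, whereas under the $n^{1/\alpha}$ scaling it should roughly level off, which is the behaviour expected when the limit is an $\alpha$-stable law and not a Gaussian.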