Solution: Proving the Weak LLN from Chebyshev

Exercise: Proving the Weak LLN from Chebyshev

Part 1

By linearity of expectation (no independence required):

\mathbb{E}[\bar X_n] = \frac{1}{n}\sum_{i=1}^n \mathbb{E}[X_i] = \mu.

For the variance, expand:

\operatorname{Var}(\bar X_n) = \frac{1}{n^2}\operatorname{Var}\!\left(\sum_{i=1}^n X_i\right) = \frac{1}{n^2}\left(\sum_{i=1}^n \operatorname{Var}(X_i) + 2\sum_{i < j}\operatorname{Cov}(X_i, X_j)\right).

Pairwise uncorrelated means $\operatorname{Cov}(X_i, X_j) = 0$ for $i \ne j$, so the second sum vanishes, leaving:

\operatorname{Var}(\bar X_n) = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n}.

This is the step — and the only step — that needs the uncorrelatedness assumption.
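As a sanity check that zero covariances are enough, the sketch below simulates variables that are pairwise uncorrelated but not independent. The construction $X_i = R Z_i$ is an illustrative assumption, not part of the exercise: $R$ is a shared random scale with $\mathbb{E}[R^2] = 1$, and the $Z_i$ are i.i.d. standard normals, so $\sigma^2 = 1$. The function names are hypothetical.

```python
import random
import statistics


def sample_mean_dependent_uncorrelated(n, rng):
    """One draw of the sample mean of X_i = R * Z_i, i = 1..n.

    R is a shared random scale with E[R^2] = 1 and the Z_i are i.i.d. N(0, 1).
    Hence Var(X_i) = 1 and Cov(X_i, X_j) = 0 for i != j, yet the X_i are
    dependent because they all share the same R.
    """
    r = 0.5 if rng.random() < 0.5 else 1.75 ** 0.5  # R^2 is 0.25 or 1.75
    return r * sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n


def variance_of_sample_mean(n, trials=10_000, seed=0):
    """Monte Carlo estimate of Var(sample mean) over `trials` repetitions."""
    rng = random.Random(seed)
    means = [sample_mean_dependent_uncorrelated(n, rng) for _ in range(trials)]
    return statistics.pvariance(means)


if __name__ == "__main__":
    for n in (5, 25, 100):
        # Second column: simulated variance; third column: sigma^2 / n.
        print(n, variance_of_sample_mean(n), 1.0 / n)
```

If the derivation is right, the simulated variance should track $1/n$ up to Monte Carlo noise, even though the $X_i$ are dependent through $R$.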

Part 2

Chebyshev's inequality. For a random variable $Y$ with finite variance $\operatorname{Var}(Y)$ and any $\epsilon > 0$:

\mathbb{P}(|Y - \mathbb{E}[Y]| > \epsilon) \le \frac{\operatorname{Var}(Y)}{\epsilon^2}.

Apply with $Y = \bar X_n$, $\mathbb{E}[Y] = \mu$, $\operatorname{Var}(Y) = \sigma^2/n$:

\mathbb{P}(|\bar X_n - \mu| > \epsilon) \le \frac{\sigma^2}{n\epsilon^2}.
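To get a feel for how loose this bound is, the following sketch compares it with the exact tail probability in one special case. The Gaussian case is an assumption chosen only because its tail has a closed form: if the $X_i$ are i.i.d. $N(\mu, \sigma^2)$, then $\bar X_n \sim N(\mu, \sigma^2/n)$. Both function names are hypothetical.

```python
import math


def chebyshev_bound(sigma2, n, eps):
    """Upper bound sigma^2 / (n * eps^2) on P(|mean - mu| > eps), capped at 1."""
    return min(1.0, sigma2 / (n * eps * eps))


def gaussian_tail(sigma2, n, eps):
    """Exact P(|mean - mu| > eps) when the X_i are i.i.d. N(mu, sigma^2).

    The sample mean is then N(mu, sigma^2 / n), so the two-sided tail equals
    erfc(z / sqrt(2)) with z = eps * sqrt(n) / sigma.
    """
    z = eps * math.sqrt(n / sigma2)
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Columns: n, Chebyshev bound, exact Gaussian tail.
        print(n, chebyshev_bound(1.0, n, 0.2), gaussian_tail(1.0, n, 0.2))
```

The bound decays like $1/n$ while the Gaussian tail decays exponentially in $n$, so the gap widens quickly. That gap is the price of assuming only a finite second moment.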

Part 3

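For any fixed $\epsilon > 0$, the right-hand side $\sigma^2/(n\epsilon^2) \to 0$ as $n \to \infty$. Since probabilities are nonnegative, $\mathbb{P}(|\bar X_n - \mu| > \epsilon) \to 0$ as well, which is exactly the definition of convergence in probability: $\bar X_n \xrightarrow{\mathbb{P}} \mu$.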
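The limit can also be read as a sample-size rule: to guarantee $\mathbb{P}(|\bar X_n - \mu| > \epsilon) \le \delta$ via Chebyshev, it is enough that $n \ge \sigma^2/(\epsilon^2\delta)$. Below is a minimal sketch of that arithmetic. The tolerance $\delta$ and the function name `required_n` are introduced here for illustration and do not appear in the exercise.

```python
import math


def required_n(sigma2, eps, delta):
    """Smallest integer n with sigma^2 / (n * eps^2) <= delta.

    This is the Chebyshev guarantee only; the true n needed may be far
    smaller when the distribution has lighter tails.
    """
    if sigma2 < 0 or eps <= 0 or not 0 < delta <= 1:
        raise ValueError("need sigma2 >= 0, eps > 0 and 0 < delta <= 1")
    return max(1, math.ceil(sigma2 / (eps * eps * delta)))


if __name__ == "__main__":
    # Halving eps quadruples the required n; halving delta only doubles it.
    print(required_n(4.0, 0.25, 0.0625))
    print(required_n(4.0, 0.125, 0.0625))
    print(required_n(4.0, 0.25, 0.03125))
```

Because floating-point division is used, inputs that are not exactly representable could in principle shift the result by one.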

Part 4

With $\operatorname{Cov}(X_i, X_j) = \rho\sigma^2$ for $i \ne j$, the covariance sum has $\binom{n}{2}$ terms:

\operatorname{Var}(\bar X_n) = \frac{1}{n^2}\left(n\sigma^2 + 2\cdot\binom{n}{2}\cdot\rho\sigma^2\right) = \frac{\sigma^2}{n} + \frac{n-1}{n}\rho\sigma^2.

As $n \to \infty$, the first term vanishes but the second approaches $\rho\sigma^2 > 0$. The variance of $\bar X_n$ does not go to zero: it has a floor at $\rho\sigma^2$, so the Chebyshev bound levels off at $\rho\sigma^2/\epsilon^2$ instead of shrinking.
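The Part 4 formula can be checked numerically by summing every entry of the equicorrelated covariance matrix and dividing by $n^2$. This is a sketch with hypothetical function names; the values $\sigma^2 = 1$ and $\rho = 0.5$ are arbitrary choices for illustration.

```python
def var_mean_from_covariances(n, sigma2, rho):
    """Var of the sample mean as (1/n^2) * (sum of all covariance entries).

    Diagonal entries are sigma2; every off-diagonal entry is rho * sigma2.
    """
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += sigma2 if i == j else rho * sigma2
    return total / (n * n)


def var_mean_closed_form(n, sigma2, rho):
    """Closed form from Part 4: sigma^2 / n + (n - 1) / n * rho * sigma^2."""
    return sigma2 / n + (n - 1) / n * rho * sigma2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Columns: n, brute-force sum, closed form. Both approach rho * sigma2.
        print(n, var_mean_from_covariances(n, 1.0, 0.5),
              var_mean_closed_form(n, 1.0, 0.5))
```

With $\rho = 0$ both functions reduce to $\sigma^2/n$, which recovers Part 1.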

Why this matters for time series. Daily returns on most momentum signals are positively autocorrelated (otherwise the signal would have no forecasting power). If we mechanically apply the LLN to a time series and report $\text{SE} = \sigma/\sqrt n$, we ignore the autocorrelation and underestimate the true standard error of $\bar X_n$. The corrected formula (the "effective sample size" adjustment, or the "Newey-West standard error") replaces $\sigma^2$ in $\sigma^2/n$ with the long-run variance, which can be an order of magnitude larger.

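Practical takeaway: the sample mean of an autocorrelated series still converges under conditions such as stationarity and ergodicity (unlike the constant-correlation case above, where the dependence never dies out), but its standard error is larger than the naive $\sigma/\sqrt n$, so more data is needed for the same precision.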
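To make the gap between the naive and the autocorrelation-aware standard error concrete, here is a minimal sketch on a simulated AR(1) series. Everything specific is an assumption for illustration: the AR(1) model with coefficient 0.6, the lag truncation of 20, and the function names. The estimator is a hand-written Newey-West long-run variance with Bartlett weights, not a call into any library.

```python
import math
import random


def simulate_ar1(n, phi, seed=0, burn_in=500):
    """Simulate x_t = phi * x_{t-1} + e_t with e_t ~ N(0, 1), after a burn-in."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for t in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(x)
    return out


def autocovariance(xs, k):
    """Sample autocovariance at lag k, normalised by n."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((xs[t] - mean) * (xs[t - k] - mean) for t in range(k, n)) / n


def naive_se(xs):
    """Standard error of the mean that ignores autocorrelation."""
    return math.sqrt(autocovariance(xs, 0) / len(xs))


def newey_west_se(xs, max_lag):
    """Standard error of the mean from a Bartlett-weighted long-run variance."""
    lrv = autocovariance(xs, 0)
    for k in range(1, max_lag + 1):
        weight = 1.0 - k / (max_lag + 1)  # Bartlett kernel weight
        lrv += 2.0 * weight * autocovariance(xs, k)
    return math.sqrt(lrv / len(xs))


if __name__ == "__main__":
    xs = simulate_ar1(5000, phi=0.6)
    print("naive SE:     ", naive_se(xs))
    print("Newey-West SE:", newey_west_se(xs, max_lag=20))
    # Theoretical ratio for AR(1): sqrt((1 + phi) / (1 - phi)) = 2 when phi = 0.6.
```

For an AR(1) process the long-run variance is $(1+\phi)/(1-\phi)$ times the marginal variance, so with $\phi = 0.6$ the corrected standard error should come out at roughly twice the naive one. The truncated, down-weighted estimator typically lands slightly below that.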

Takeaways

Pairwise uncorrelatedness (with a common finite variance) suffices for the weak LLN; full independence is overkill. Chebyshev is the workhorse because it asks only for a second moment.
$\operatorname{Var}(\bar X_n) = \sigma^2/n$ is the foundational formula of frequentist statistics. The familiar standard error $\sigma/\sqrt n$ is its square root, and most standard error calculations descend from it.
Positive autocorrelation breaks the $\sigma^2/n$ formula. Use HAC / Newey-West standard errors for backtested time series, not the naive IID formula.
Chebyshev is loose but universal. Sharper bounds (Bernstein, Hoeffding, Cramér) exist for sub-Gaussian or bounded variables, but they require stronger distributional assumptions.