
Solution: Proving the Weak LLN from Chebyshev

Part 1

By linearity of expectation (no independence required):

$$\mathbb{E}[\bar X_n] = \frac{1}{n}\sum_{i=1}^n \mathbb{E}[X_i] = \mu.$$

For the variance, expand:

$$\operatorname{Var}(\bar X_n) = \frac{1}{n^2}\operatorname{Var}\!\left(\sum_{i=1}^n X_i\right) = \frac{1}{n^2}\left(\sum_{i=1}^n \operatorname{Var}(X_i) + 2\sum_{i < j}\operatorname{Cov}(X_i, X_j)\right).$$

Pairwise uncorrelated means $\operatorname{Cov}(X_i, X_j) = 0$ for $i \ne j$, so the second sum vanishes, leaving:

$$\operatorname{Var}(\bar X_n) = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n}.$$

This is the step — and the only step — that needs the uncorrelatedness assumption.
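As a quick sanity check on the expansion above, the variance of the sample mean can be computed directly from a covariance matrix as $\frac{1}{n^2}\sum_{i,j}\operatorname{Cov}(X_i, X_j)$. The sketch below is illustrative only: the helper name `var_of_sample_mean` and the numbers are assumptions, not part of the original problem.

```python
def var_of_sample_mean(cov):
    """Variance of the sample mean given an n x n covariance matrix (list of lists)."""
    n = len(cov)
    # Var(sum X_i) is the sum of every matrix entry: variances on the diagonal,
    # covariances off the diagonal (each unordered pair appears twice).
    total = sum(cov[i][j] for i in range(n) for j in range(n))
    return total / n**2


# Hypothetical example: n = 8 pairwise uncorrelated variables, each with variance 2.0.
n, sigma2 = 8, 2.0
diag_cov = [[sigma2 if i == j else 0.0 for j in range(n)] for i in range(n)]
print(var_of_sample_mean(diag_cov), sigma2 / n)
```

With a diagonal covariance matrix the two printed values agree. Any nonzero off-diagonal entry would change the first value but not the second.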

Part 2

Chebyshev's inequality. For a random variable $Y$ with finite variance $\operatorname{Var}(Y)$ and any $\epsilon > 0$:

$$\mathbb{P}(|Y - \mathbb{E}[Y]| > \epsilon) \le \frac{\operatorname{Var}(Y)}{\epsilon^2}.$$

Apply with $Y = \bar X_n$, $\mathbb{E}[Y] = \mu$, $\operatorname{Var}(Y) = \sigma^2/n$:

$$\mathbb{P}(|\bar X_n - \mu| > \epsilon) \le \frac{\sigma^2}{n\epsilon^2}.$$
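The bound can be compared against simulation. The following is a minimal sketch under an assumed setup that is not in the original: independent (hence uncorrelated) Uniform(0, 1) draws, so $\mu = 1/2$ and $\sigma^2 = 1/12$. The function names are hypothetical.

```python
import random


def chebyshev_bound(sigma2, n, eps):
    """Upper bound sigma^2 / (n * eps^2) on P(|mean - mu| > eps)."""
    return sigma2 / (n * eps**2)


def empirical_tail(n, eps, trials, seed=0):
    """Fraction of simulated sample means that land more than eps from mu."""
    rng = random.Random(seed)
    mu = 0.5  # mean of Uniform(0, 1)
    misses = 0
    for _ in range(trials):
        mean = sum(rng.random() for _ in range(n)) / n
        if abs(mean - mu) > eps:
            misses += 1
    return misses / trials


# Assumed setup: X_i ~ Uniform(0, 1), so mu = 1/2 and sigma^2 = 1/12.
n, eps = 50, 0.1
bound = chebyshev_bound(1 / 12, n, eps)
freq = empirical_tail(n, eps, trials=2000)
print(f"empirical tail frequency: {freq:.4f}, Chebyshev bound: {bound:.4f}")
```

The simulated frequency should sit well below the bound, which reflects how conservative Chebyshev is for well-behaved distributions.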

Part 3

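For any fixed $\epsilon > 0$, the right-hand side $\sigma^2/(n\epsilon^2) \to 0$ as $n \to \infty$. Since probabilities are nonnegative, $\mathbb{P}(|\bar X_n - \mu| > \epsilon)$ is squeezed to $0$ as well. By the definition of convergence in probability, $\bar X_n \xrightarrow{\mathbb{P}} \mu$.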
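Part 3 can also be read quantitatively: requiring $\sigma^2/(n\epsilon^2) \le \delta$ and solving for $n$ gives $n \ge \sigma^2/(\epsilon^2\delta)$. A minimal sketch follows, where `delta` (the tolerated failure probability) and the function name are illustrative assumptions not in the original.

```python
import math


def chebyshev_sample_size(sigma2, eps, delta):
    """Smallest n (up to floating-point rounding) with sigma^2 / (n * eps^2) <= delta."""
    if sigma2 < 0 or eps <= 0 or not 0 < delta <= 1:
        raise ValueError("need sigma2 >= 0, eps > 0 and 0 < delta <= 1")
    return max(1, math.ceil(sigma2 / (eps**2 * delta)))


# Hypothetical numbers: unit variance, tolerance 0.5, failure probability at most 0.25.
n_needed = chebyshev_sample_size(sigma2=1.0, eps=0.5, delta=0.25)
print(n_needed)
```

Because the required $n$ scales like $1/\epsilon^2$, halving the tolerance quadruples the sample size this bound asks for.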

Part 4

With $\operatorname{Cov}(X_i, X_j) = \rho\sigma^2$ for $i \ne j$, the covariance sum has $\binom{n}{2}$ terms:

$$\operatorname{Var}(\bar X_n) = \frac{1}{n^2}\left(n\sigma^2 + 2\cdot\binom{n}{2}\cdot\rho\sigma^2\right) = \frac{\sigma^2}{n} + \frac{n-1}{n}\rho\sigma^2.$$

As $n \to \infty$, the first term vanishes but the second approaches $\rho\sigma^2$, which is strictly positive when $\rho > 0$. The variance of $\bar X_n$ then does not go to zero, so the Chebyshev bound levels off at $\rho\sigma^2/\epsilon^2$ instead of shrinking, and the argument from Part 3 no longer goes through.
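The plateau is easy to see numerically. The sketch below evaluates the equicorrelated variance formula for growing $n$. The function name and the values of $\sigma^2$ and $\rho$ are hypothetical choices for illustration.

```python
def var_mean_equicorrelated(sigma2, rho, n):
    """sigma^2/n + ((n - 1)/n) * rho * sigma^2 for constant pairwise covariance rho * sigma^2."""
    return sigma2 / n + (n - 1) / n * rho * sigma2


# Hypothetical values: sigma^2 = 1, rho = 0.5.
sigma2, rho = 1.0, 0.5
for n in (1, 4, 100, 10_000):
    print(n, var_mean_equicorrelated(sigma2, rho, n))
```

The printed variances decrease toward $\rho\sigma^2 = 0.5$ and stay there, while setting `rho = 0` recovers the $\sigma^2/n$ decay of Part 1.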
Why this matters for time series. Daily returns on most momentum signals are positively autocorrelated (otherwise the signal would have no forecasting power). If we mechanically apply the LLN to a time series and report $\text{SE} = \sigma/\sqrt n$, we ignore the autocorrelation and underestimate the true standard error of $\bar X_n$. The corrected formula, known as the "effective sample size" adjustment or the "Newey-West standard error", replaces $\sigma^2$ in $\sigma^2/n$ with the long-run variance, which can be an order of magnitude larger.

Practical takeaway: the sample mean of an autocorrelated series still converges (under weaker conditions like ergodicity), but its standard error is larger than the naive IID calculation suggests, so more data is needed for the same precision.
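To put a number on the gap between the naive and corrected standard errors, consider a stationary AR(1) series. This is an assumed illustrative model, not part of the original problem: $\operatorname{Cov}(X_i, X_j) = \sigma^2\phi^{|i-j|}$ with $|\phi| < 1$. Summing the covariances exactly gives $\operatorname{Var}(\bar X_n) = \frac{\sigma^2}{n}\left[1 + 2\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\phi^k\right]$, and the bracket tends to $(1+\phi)/(1-\phi)$ for large $n$. The function names below are hypothetical.

```python
import math


def variance_inflation_ar1(phi, n):
    """Exact ratio Var(sample mean) / (sigma^2 / n) under AR(1) autocovariances."""
    return 1.0 + 2.0 * sum((1 - k / n) * phi**k for k in range(1, n))


def naive_and_corrected_se(sigma, phi, n):
    """Return (naive SE, SE adjusted for AR(1) autocorrelation)."""
    naive = sigma / math.sqrt(n)
    return naive, naive * math.sqrt(variance_inflation_ar1(phi, n))


# Hypothetical inputs: sigma = 1, phi = 0.5, n = 1000 observations.
naive, corrected = naive_and_corrected_se(sigma=1.0, phi=0.5, n=1000)
print(f"naive SE: {naive:.4f}, corrected SE: {corrected:.4f}")
print(f"effective sample size: {1000 / variance_inflation_ar1(0.5, 1000):.0f}")
```

With $\phi = 0.5$ the variance is roughly three times the IID value, so about a third of the observations count as "effective". In practice $\phi$ and the long-run variance are unknown and must be estimated from the data, which is the job of HAC estimators such as Newey-West.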

Takeaways

  • Uncorrelated suffices for the weak LLN — full independence is overkill. Chebyshev is the workhorse because it asks only for a second moment.
  • Variance of the sample mean $= \sigma^2/n$ is the foundational formula of frequentist statistics. Every standard error calculation descends from it.
  • Positive autocorrelation breaks the $\sigma^2/n$ formula. Use HAC / Newey-West standard errors for backtested time series, not the naive IID formula.
  • Chebyshev is loose but universal. Sharper bounds (Bernstein, Hoeffding, Cramér) exist for sub-Gaussian or bounded variables, but they require stronger distributional assumptions.