
Solution: MLE for a Normal with Unknown Mean and Variance

Parts 1–2

$$\ell(\mu, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i - \mu)^2.$$

$\partial\ell/\partial\mu = \tfrac{1}{\sigma^2}\sum_{i=1}^n (x_i - \mu) = 0 \Rightarrow \hat\mu = \bar x$.

$\partial\ell/\partial\sigma^2 = -\tfrac{n}{2\sigma^2} + \tfrac{1}{2\sigma^4}\sum_{i=1}^n (x_i - \mu)^2 = 0 \Rightarrow \sigma^2 = \tfrac{1}{n}\sum_{i=1}^n (x_i - \mu)^2$. Substituting $\hat\mu = \bar x$ for $\mu$:

$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar x)^2.$$
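As a sanity check on the closed forms above, the sketch below evaluates the log-likelihood at the closed-form estimates and at a grid of nearby perturbed values. The function name `log_likelihood`, the simulated sample, and the perturbation sizes are illustrative assumptions, not part of the original solution.

```python
import numpy as np

def log_likelihood(x, mu, sigma2):
    """Normal log-likelihood l(mu, sigma^2) for an i.i.d. sample x."""
    n = x.size
    return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum((x - mu) ** 2) / (2 * sigma2)

# Hypothetical sample; any real-valued data with n >= 2 would do.
rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=3.0, size=50)

mu_hat = x.mean()                         # closed-form MLE of mu
sigma2_hat = np.mean((x - mu_hat) ** 2)   # closed-form MLE of sigma^2 (divides by n)

best = log_likelihood(x, mu_hat, sigma2_hat)

# Perturb each parameter slightly; the log-likelihood should never increase.
perturbed = [
    log_likelihood(x, mu_hat + d_mu, sigma2_hat * scale)
    for d_mu in (-0.1, 0.0, 0.1)
    for scale in (0.9, 1.0, 1.1)
]
is_max = all(val <= best for val in perturbed)
print(is_max)
```

A grid check like this only probes a neighborhood of the estimates. It does not replace the first-order conditions, but it catches sign or factor errors in the algebra.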

Part 3

$\mathbb{E}\left[\sum_{i=1}^n (X_i - \bar X)^2\right] = \mathbb{E}\left[\sum_{i=1}^n (X_i - \mu)^2 - n(\bar X - \mu)^2\right] = n\sigma^2 - n\cdot\sigma^2/n = (n-1)\sigma^2$.

So $\mathbb{E}[\hat\sigma^2] = (n-1)\sigma^2/n < \sigma^2$. The MLE of the variance is biased downward.
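The first equality in Part 3 relies on a decomposition that is easy to miss. Writing $X_i - \bar X = (X_i - \mu) - (\bar X - \mu)$ and using $\sum_{i=1}^n (X_i - \mu) = n(\bar X - \mu)$, the cross term collapses:

$$
\begin{aligned}
\sum_{i=1}^n (X_i - \bar X)^2
&= \sum_{i=1}^n (X_i - \mu)^2 - 2(\bar X - \mu)\sum_{i=1}^n (X_i - \mu) + n(\bar X - \mu)^2 \\
&= \sum_{i=1}^n (X_i - \mu)^2 - 2n(\bar X - \mu)^2 + n(\bar X - \mu)^2 \\
&= \sum_{i=1}^n (X_i - \mu)^2 - n(\bar X - \mu)^2 .
\end{aligned}
$$

Taking expectations term by term, $\mathbb{E}[(X_i - \mu)^2] = \sigma^2$ and $\mathbb{E}[(\bar X - \mu)^2] = \operatorname{Var}(\bar X) = \sigma^2/n$ for i.i.d. draws, which gives the $n\sigma^2 - n\cdot\sigma^2/n$ step.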

Part 4

Bessel-corrected estimator: $\hat\sigma^2_{\text{Bessel}} = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar x)^2$, with $\mathbb{E}[\hat\sigma^2_{\text{Bessel}}] = \sigma^2$. This is what `numpy.var(x, ddof=1)` computes.
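To make the `ddof` argument concrete, here is a small sketch on a hypothetical eight-point sample chosen so the arithmetic is easy to follow by hand: the mean is 5 and the sum of squared deviations is 32, so the two estimators are 32/8 and 32/7.

```python
import numpy as np

# Hypothetical sample (not from the original problem).
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

ss = np.sum((x - x.mean()) ** 2)   # sum of squared deviations from the sample mean

var_mle = np.var(x, ddof=0)        # divides by n (NumPy's default ddof)
var_bessel = np.var(x, ddof=1)     # divides by n - 1

print(var_mle, ss / n)
print(var_bessel, ss / (n - 1))
print(np.isclose(var_bessel, var_mle * n / (n - 1)))
```

The last line checks that the two estimators differ only by the factor $n/(n-1)$, so either can be recovered from the other.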

Part 5 — Simulation

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 10_000, 10                 # m simulated samples, each of size n
    X = rng.standard_normal((m, n))   # standard normal draws, so the true variance is 1

    # Average each estimator over the m samples.
    mle_var = np.mean(X.var(axis=1, ddof=0))
    bessel_var = np.mean(X.var(axis=1, ddof=1))

    print(f"average MLE variance: {mle_var:.4f} (expected (n-1)/n = {(n-1)/n:.4f})")
    print(f"average Bessel variance: {bessel_var:.4f} (expected 1.0)")
    # average MLE variance: 0.9036 (expected (n-1)/n = 0.9000)
    # average Bessel variance: 1.0040 (expected 1.0)

Takeaways

  • The MLE for $\mu$ is $\bar x$, and it is unbiased ($\mathbb{E}[\bar X] = \mu$).
  • The MLE for $\sigma^2$ is biased downward by a factor of $(n-1)/n$. The bias arises because $\sum(x_i - \bar x)^2 \le \sum(x_i - \mu)^2$, with equality only when $\bar x = \mu$: $\bar x$ is the value that minimizes the sum of squared deviations, so measuring spread around $\bar x$ instead of $\mu$ uses up one degree of freedom.
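  • Bessel correction divides by $n - 1$ instead of $n$ to recover an unbiased estimator. It is the default in some tools (R's `var`, pandas' `.var()`), but not in NumPy, where `numpy.var` defaults to `ddof=0` and `ddof=1` must be requested explicitly.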
  • The MLE is asymptotically unbiased: $(n-1)/n \to 1$ as $n \to \infty$. For $n$ in the thousands, the MLE and Bessel estimators are practically indistinguishable; in small samples (monthly data, rare events), the distinction matters.
  • Efficiency vs. unbiasedness trade-off. The MLE has smaller variance than the Bessel-corrected estimator but is biased. Under normality its MSE is also smaller for every $n \ge 2$, with the gap most visible for small $n$.
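The MSE comparison in the last takeaway can be checked numerically. Under normality, standard results give $\operatorname{MSE}(\hat\sigma^2) = (2n-1)\sigma^4/n^2$ for the MLE and $\operatorname{MSE}(\hat\sigma^2_{\text{Bessel}}) = 2\sigma^4/(n-1)$ for the unbiased estimator. The sketch below compares these closed forms with simulated values; the replication count `m`, the sample size `n`, and the seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100_000, 10          # m replications of samples of size n (illustrative choices)
sigma2 = 1.0                # true variance of the standard normal draws

X = rng.standard_normal((m, n))
mle = X.var(axis=1, ddof=0)
bessel = X.var(axis=1, ddof=1)

# Empirical mean squared error around the true variance.
mse_mle = np.mean((mle - sigma2) ** 2)
mse_bessel = np.mean((bessel - sigma2) ** 2)

# Closed-form MSEs under normality.
mse_mle_theory = (2 * n - 1) * sigma2**2 / n**2
mse_bessel_theory = 2 * sigma2**2 / (n - 1)

print(f"MLE    MSE: simulated {mse_mle:.4f}, theory {mse_mle_theory:.4f}")
print(f"Bessel MSE: simulated {mse_bessel:.4f}, theory {mse_bessel_theory:.4f}")
```

For $n = 10$ the closed forms are $0.19$ and $2/9 \approx 0.222$, so the biased estimator wins on MSE even though it loses on bias.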