Solution: Bernoulli Sums and the de Moivre-Laplace Approximation

Part 1

For the sum $S_n$ of $n = 100$ i.i.d. Bernoulli($p$) variables with $p = 0.55$:

$$\mathbb{E}[S_n] = np = 55, \qquad \operatorname{Var}(S_n) = np(1-p) = 100\cdot 0.55\cdot 0.45 = 24.75.$$

So $S_{100} \overset{d}{\approx} \mathcal{N}(55,\,24.75)$, with standard deviation $\sqrt{24.75} \approx 4.975$.
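The moments above can be sanity-checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original solution. It compares the closed-form values with moments summed directly from the binomial pmf.

```python
import math

n, p = 100, 0.55

# Closed-form moments of a Binomial(n, p) sum.
mean = n * p
var = n * p * (1 - p)
sd = math.sqrt(var)

# Cross-check by summing over the exact binomial pmf.
pmf = [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
mean_pmf = sum(k * w for k, w in enumerate(pmf))
var_pmf = sum((k - mean_pmf) ** 2 * w for k, w in enumerate(pmf))

print(f"closed form: mean = {mean:.2f}, var = {var:.2f}, sd = {sd:.3f}")
print(f"from pmf:    mean = {mean_pmf:.2f}, var = {var_pmf:.2f}")
# closed form: mean = 55.00, var = 24.75, sd = 4.975
# from pmf:    mean = 55.00, var = 24.75
```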

Part 2

Without continuity correction:

$$\mathbb{P}(S_{100} \ge 60) \approx 1 - \Phi\!\left(\frac{60 - 55}{4.975}\right) = 1 - \Phi(1.005) \approx 1 - 0.8426 = 0.1574.$$
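The table lookup can be replaced by a direct evaluation of $\Phi$. This is a small sketch that assumes Python 3.8+ for `statistics.NormalDist`; the names `z` and `tail` are illustrative.

```python
import math
from statistics import NormalDist

n, p, k = 100, 0.55, 60
mean = n * p
sd = math.sqrt(n * p * (1 - p))

# Standardise the cutoff at k itself (no half-unit shift).
z = (k - mean) / sd
tail = 1 - NormalDist().cdf(z)

print(f"z = {z:.3f}, uncorrected P(S_100 >= 60) ~ {tail:.4f}")
```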

Part 3

With continuity correction:

$$\mathbb{P}(S_{100} \ge 60) \approx 1 - \Phi\!\left(\frac{59.5 - 55}{4.975}\right) = 1 - \Phi(0.9045) \approx 1 - 0.8171 = 0.1829.$$
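One way to see why the cutoff moves to $59.5$: the correction assigns each integer $k$ the Gaussian mass on $[k - 0.5,\, k + 0.5]$, and the bins for $k \ge 60$ tile $[59.5, \infty)$. The sketch below (standard library only, illustrative names) checks that summing those bins gives the same number as the corrected tail.

```python
import math
from statistics import NormalDist

n, p = 100, 0.55
approx = NormalDist(n * p, math.sqrt(n * p * (1 - p)))

# Continuity-corrected tail: integrate the Gaussian from 59.5 upward.
corrected = 1 - approx.cdf(59.5)

# Each integer k is assigned the Gaussian mass on [k - 0.5, k + 0.5].
# Summing these bins for k = 60..100 covers [59.5, 100.5]; the mass
# beyond 100.5 is negligible, so the sum matches the corrected tail.
binned = sum(approx.cdf(k + 0.5) - approx.cdf(k - 0.5) for k in range(60, n + 1))

print(f"corrected tail: {corrected:.4f}")
print(f"sum of bins:    {binned:.4f}")
```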

Part 4

```python
from scipy.stats import binom

# P(S_100 >= 60) = 1 - P(S_100 <= 59)
exact = 1 - binom.cdf(59, n=100, p=0.55)
print(f"Exact binomial: {exact:.4f}")
# Exact binomial: 0.1831
```

Comparison: exact $0.1831$, corrected Gaussian $0.1829$ at full precision (off by $0.0002$), uncorrected Gaussian $0.1574$ (off by $0.026$). The continuity correction is dramatically better: its error is roughly a hundred times smaller for the same computational cost.
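As an independent cross-check of the exact tail that does not rely on scipy, the pmf can be summed in exact rational arithmetic. This is a sketch under the assumption that $p = 0.55$ is meant exactly, so it is written as the fraction $11/20$; the names are illustrative.

```python
from fractions import Fraction
from math import comb

n, k_min = 100, 60
p = Fraction(11, 20)  # 0.55 as an exact rational

# Sum the binomial pmf over k = 60..100 with no floating-point rounding.
tail = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Convert to float only for display.
print(f"Exact tail: {float(tail):.4f}")
```

Because every term is a `Fraction`, the only rounding happens in the final conversion to `float`.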

Takeaways

  • The Gaussian limit given by the CLT is a continuous approximation to a discrete distribution. Shifting the cutoff by half a unit, to the midpoint between adjacent integers (continuity correction), recovers almost all the accuracy lost to the discretisation.
  • At $n = 100$ the uncorrected approximation is already off by about 14% in relative terms. For smaller $n$ the error grows quickly; use the exact binomial, not the Gaussian, for $n \lesssim 25$.
  • Always use the continuity correction on discrete CLT problems. The cost is typing ±0.5; the gain is an order of magnitude in accuracy.
  • This is the de Moivre-Laplace theorem, historically the first CLT: proved by de Moivre for fair coins ($p = 1/2$) in 1733 and generalised by Laplace to arbitrary $p \in (0, 1)$ in 1810. It predates the general CLT by over a century.
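The dependence on $n$ in the second takeaway can be explored with a short sweep. This is a sketch only: the sample sizes and the threshold rule (the first integer at least one standard deviation above the mean, which gives $k = 60$ at $n = 100$) are illustrative assumptions, not part of the original problem, and the helper names are hypothetical.

```python
import math
from statistics import NormalDist

def exact_upper_tail(k, n, p):
    """Exact P(S_n >= k) for S_n ~ Binomial(n, p), summed from the pmf."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

def normal_upper_tail(k, n, p, continuity):
    """Gaussian approximation to P(S_n >= k), optionally continuity-corrected."""
    mean, sd = n * p, math.sqrt(n * p * (1 - p))
    cutoff = k - 0.5 if continuity else k
    return 1 - NormalDist(mean, sd).cdf(cutoff)

p = 0.55
rows = []
for n in (10, 25, 100, 400):
    # Assumed threshold: first integer at least one sd above the mean.
    k = math.ceil(n * p + math.sqrt(n * p * (1 - p)))
    exact = exact_upper_tail(k, n, p)
    rel_plain = abs(normal_upper_tail(k, n, p, False) - exact) / exact
    rel_corr = abs(normal_upper_tail(k, n, p, True) - exact) / exact
    rows.append((n, k, exact, rel_plain, rel_corr))
    print(f"n={n:4d} k={k:4d} exact={exact:.4f} "
          f"uncorrected rel. err={rel_plain:.1%} corrected rel. err={rel_corr:.1%}")
```

With the threshold held near one standard deviation, the uncorrected relative error should shrink as $n$ grows, while the corrected error stays small throughout; other threshold rules may behave differently.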