Exercise: Deriving the Normal Equations by Calculus

Problem

We have data $(x_1, y_1), \ldots, (x_n, y_n)$ with $x_i, y_i \in \mathbb{R}$ (simple one-dimensional regression with an intercept). The model is $y_i = \alpha + \beta x_i + \epsilon_i$.

  1. Write the sum of squared residuals $L(\alpha, \beta) = \sum_{i=1}^n (y_i - \alpha - \beta x_i)^2$ and take partial derivatives with respect to $\alpha$ and $\beta$. Set both to zero.

  2. Solve the resulting two equations to derive the classical closed forms $\hat\beta = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2}, \quad \hat\alpha = \bar y - \hat\beta \bar x$.

  3. Verify that $\hat\beta$ equals the sample covariance of $(x, y)$ divided by the sample variance of $x$: $\hat\beta = \widehat{\text{Cov}}(x, y)/\widehat{\text{Var}}(x)$.

  4. Numerical sanity check. Generate $n = 100$ points with $x_i \sim \mathcal{N}(0, 1)$, $y_i = 2 + 3 x_i + \epsilon_i$, $\epsilon_i \sim \mathcal{N}(0, 1)$. Use numpy.polyfit or your closed-form formulas to estimate $\hat\alpha, \hat\beta$. Both should be close to $(2, 3)$.
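
One possible way to set up the numerical check in part 4 is sketched below. It is a minimal sketch, not a reference solution: the seed, the variable names (x_bar, beta_hat, cov_xy, and so on), and the choice of numpy.random.default_rng are assumptions made for illustration. It computes the closed-form estimates from part 2, compares them with numpy.polyfit, and also evaluates the covariance-over-variance ratio from part 3.

```python
import numpy as np

# Assumption: a fixed seed, chosen arbitrarily, so the run is reproducible.
rng = np.random.default_rng(0)

n = 100
x = rng.normal(0.0, 1.0, size=n)      # x_i ~ N(0, 1)
eps = rng.normal(0.0, 1.0, size=n)    # eps_i ~ N(0, 1)
y = 2.0 + 3.0 * x + eps               # true alpha = 2, true beta = 3

# Closed-form estimates from part 2.
x_bar, y_bar = x.mean(), y.mean()
beta_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
alpha_hat = y_bar - beta_hat * x_bar

# numpy.polyfit returns coefficients from highest degree to lowest,
# so a degree-1 fit gives [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)

# Part 3: sample covariance over sample variance (both with n - 1).
cov_xy = np.cov(x, y, ddof=1)[0, 1]
var_x = np.var(x, ddof=1)
beta_from_cov = cov_xy / var_x

print("closed form:", alpha_hat, beta_hat)
print("polyfit:    ", intercept, slope)
print("cov / var:  ", beta_from_cov)
```

The three slope values should agree up to floating-point error, since the normalizing factor (whether n or n - 1) cancels in the ratio. The estimates themselves will differ from (2, 3) by sampling noise, roughly on the order of 0.1 for n = 100.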

Hint

For part 1: $\partial L/\partial \alpha = -2\sum (y_i - \alpha - \beta x_i) = 0$ gives $\sum y_i = n\alpha + \beta\sum x_i$. Similarly for $\beta$.
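
If you want to check your work on the second derivative, the analogous step for the slope should come out as follows (same notation as the hint above; try it yourself first):

$$
\frac{\partial L}{\partial \beta} = -2\sum x_i\,(y_i - \alpha - \beta x_i) = 0
\quad\Longrightarrow\quad
\sum x_i y_i = \alpha \sum x_i + \beta \sum x_i^2 .
$$

Together with the equation for $\alpha$, this gives the two linear equations in $(\alpha, \beta)$ that part 2 asks you to solve.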

Jump to the solution when you're ready.