Solution: Why Bias Correction Changes Early Steps
Exercise: Why Bias Correction Changes Early Steps
Solution
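With $m_0 = 0$ and a constant gradient $g$, the update $m_t = \beta m_{t-1} + (1-\beta)\,g$ gives $m_1 = (1-\beta)\,g$. Then $m_2 = \beta m_1 + (1-\beta)\,g = (1-\beta^2)\,g$. For $\beta$ close to 1, both are far below the constant gradient because the moving average started at zero. Dividing by $1-\beta^t$ gives $\hat{m}_t = g$ at both $t=1$ and $t=2$ for a constant gradient.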
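The arithmetic can be checked numerically. The following is a minimal sketch, assuming the usual exponential moving average update `m = beta * m + (1 - beta) * g` started from zero; the function name `ema_with_correction` and the values `beta = 0.9` and `g = 1.0` are illustrative assumptions, not taken from the exercise.

```python
def ema_with_correction(grads, beta):
    """Return (step, raw average, bias-corrected average) for each gradient."""
    m = 0.0  # zero initialisation is the source of the downward bias
    history = []
    for t, g in enumerate(grads, start=1):
        m = beta * m + (1.0 - beta) * g   # raw moving average
        m_hat = m / (1.0 - beta ** t)     # divide by 1 - beta^t to correct
        history.append((t, m, m_hat))
    return history


# Illustrative values (assumed): beta = 0.9, constant gradient g = 1.0.
beta, g = 0.9, 1.0
history = ema_with_correction([g, g], beta)
for t, m, m_hat in history:
    print(f"t={t}  raw={m:.4f}  corrected={m_hat:.4f}")
```

With these assumed values the raw averages sit near 0.1 and 0.19, well under the gradient of 1, while the corrected values recover the gradient at both steps.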
Takeaways
- Zero initialisation creates downward bias.
- Bias correction is most important early.
- The correction fades as $t$ grows, because $1-\beta^t \to 1$.
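To see how quickly the correction fades, one can tabulate the factor that multiplies the raw average at each step. This is a small sketch under the same assumptions as the standard bias-corrected moving average; `correction_factor` is a hypothetical helper and `beta = 0.9` is an illustrative choice.

```python
def correction_factor(beta, t):
    """Factor 1 / (1 - beta^t) applied to the raw moving average at step t."""
    return 1.0 / (1.0 - beta ** t)


beta = 0.9  # assumed decay rate, for illustration only
steps = (1, 2, 10, 50, 100)
factors = {t: correction_factor(beta, t) for t in steps}
for t in steps:
    print(f"t={t:3d}  factor={factors[t]:.4f}")
```

The factor starts large at the first step and approaches 1 as the step count increases, so later updates are left almost unchanged.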