Solution: Compute the First Adam Update by Hand
Exercise: Compute the First Adam Update by Hand
Solution
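With $m_0 = v_0 = 0$, the first moment estimates are $m_1 = (1-\beta_1)\,g_1$ and $v_1 = (1-\beta_2)\,g_1^2$. Bias correction gives $\hat{m}_1 = m_1/(1-\beta_1) = g_1$ and $\hat{v}_1 = v_1/(1-\beta_2) = g_1^2$. The step is $\Delta\theta = -\alpha\,\hat{m}_1/(\sqrt{\hat{v}_1}+\epsilon) = -\alpha\,g_1/(|g_1|+\epsilon)$.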
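The same calculation can be checked numerically. The following is a minimal sketch for a single scalar gradient, assuming the standard Adam update with zero-initialized moments; the function name `adam_first_step`, the gradient value `0.5`, and the hyperparameter defaults are illustrative assumptions, not values taken from the exercise.

```python
import math

def adam_first_step(g, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """First Adam update for a scalar gradient g (hypothetical helper)."""
    m0, v0 = 0.0, 0.0                       # moments start at zero
    t = 1                                   # first step
    m1 = beta1 * m0 + (1 - beta1) * g       # biased first moment
    v1 = beta2 * v0 + (1 - beta2) * g * g   # biased second moment
    m_hat = m1 / (1 - beta1 ** t)           # bias-corrected: equals g
    v_hat = v1 / (1 - beta2 ** t)           # bias-corrected: equals g**2
    delta = -lr * m_hat / (math.sqrt(v_hat) + eps)
    return m1, v1, m_hat, v_hat, delta

# Illustrative gradient value (an assumption, not from the exercise).
m1, v1, m_hat, v_hat, delta = adam_first_step(g=0.5)
print("m1 =", m1, "v1 =", v1)
print("m_hat =", m_hat, "v_hat =", v_hat)
print("delta =", delta)
```

Up to floating-point rounding, `m_hat` matches the gradient and `v_hat` matches its square, so `delta` comes out very close to `-lr`.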
Takeaways
- Bias correction exactly removes the initial zero bias at $t = 1$.
- The first Adam step direction $\hat{m}_1/(\sqrt{\hat{v}_1}+\epsilon)$ has sign equal to the gradient sign, so the parameter moves against the gradient.
- The scale is controlled by $\alpha$ when $|g_1| \gg \epsilon$.
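These points can be illustrated with the closed form of the first update for a scalar gradient, $-\alpha\,g/(|g|+\epsilon)$, which holds when $\epsilon$ is added outside the square root. The sketch below is only an illustration: the helper `first_step`, the learning rate, and the sample gradients are assumptions chosen for the demonstration.

```python
def first_step(g, lr=1e-3, eps=1e-8):
    # Closed form of the first bias-corrected Adam update for scalar g,
    # assuming eps is added outside the square root.
    return -lr * g / (abs(g) + eps)

# Gradients spanning five orders of magnitude (illustrative values).
grads = [-100.0, -1e-3, 1e-3, 100.0]
steps = [first_step(g) for g in grads]
for g, s in zip(grads, steps):
    print(f"g={g:+.0e}  step={s:+.6e}")

# When |g| is far below eps, eps dominates the denominator and the step shrinks.
tiny = first_step(1e-10)
print(f"g=+1e-10  step={tiny:+.6e}")
```

For every gradient well above `eps`, the step magnitude stays close to `lr` and its sign is opposite to the gradient; only the last case, with the gradient far below `eps`, gives a much smaller step.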