33.
I'd like to add to the momentum discourse that Nesterov momentum can be seen as a form of PSGD with the criterion c(m; g) = 1/2 <m, m> - <m, g>, whose minimum in expectation is m = E[g] (without the 1/2, the minimum would sit at E[g]/2). The application of m to g is an affine Lie group preconditioner, P = [[m, c], [0
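
To make the "minimum is m = E[g]" part concrete, here is a minimal sketch, assuming the criterion carries a factor of 1/2 on <m, m> so that its gradient with respect to m is m - g. Under that assumption, plain SGD on the criterion with step size lr is the same update as an exponential moving average of the gradients with beta = 1 - lr. The function names, the step size, and the synthetic gradients are all hypothetical, chosen only for illustration; this shows the momentum buffer alone, not the Nesterov look-ahead or the preconditioner P.

```python
import random


def criterion_grad(m, g):
    # Gradient of c(m; g) = 0.5 * <m, m> - <m, g> with respect to m (assumed form).
    return [mi - gi for mi, gi in zip(m, g)]


def sgd_on_criterion(grads, lr=0.1):
    # Plain SGD on c(m; g), one step per observed gradient g.
    m = [0.0] * len(grads[0])
    for g in grads:
        dm = criterion_grad(m, g)
        m = [mi - lr * di for mi, di in zip(m, dm)]
    return m


def ema(grads, beta=0.9):
    # Standard exponential moving average of the gradients.
    m = [0.0] * len(grads[0])
    for g in grads:
        m = [beta * mi + (1.0 - beta) * gi for mi, gi in zip(m, g)]
    return m


# Synthetic noisy gradients around a fixed mean (illustrative data only).
rng = random.Random(0)
true_mean = [1.0, -2.0]
grads = [[mu + rng.gauss(0.0, 1.0) for mu in true_mean] for _ in range(2000)]

m_sgd = sgd_on_criterion(grads, lr=0.1)
m_ema = ema(grads, beta=0.9)
```

With lr = 1 - beta the two buffers agree up to floating-point rounding, and with a constant gradient the buffer converges to that gradient.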