This is a good stress test for optimizers. Kaon is basically Muon/lmo + spectral noise. It preserves the singular vectors of the gradient and randomizes only the positive singular weights. For exchangeable noise, the conditional expec
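
A rough NumPy sketch of the "Muon/lmo + spectral noise" idea as described above, not the actual Kaon definition. Everything named here is an assumption for illustration: the function `kaon_like_direction`, the log-normal weight distribution, and the `sigma` and `tol` parameters. The weights are drawn i.i.d., which is one simple case of exchangeable noise.

```python
import numpy as np

def kaon_like_direction(grad, rng, sigma=0.5, tol=1e-12):
    """Hypothetical sketch: keep the gradient's singular vectors,
    replace its positive singular values with random positive weights."""
    # Thin SVD: grad = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(grad, full_matrices=False)
    # Muon/lmo-style direction would use weight 1 on every positive
    # singular value, i.e. U @ Vt. Here the weights are randomized instead.
    # ASSUMPTION: i.i.d. log-normal weights (strictly positive, exchangeable).
    w = rng.lognormal(mean=0.0, sigma=sigma, size=s.shape)
    # Only directions with a positive singular value get a weight;
    # (numerically) zero singular values stay at zero.
    w = np.where(s > tol, w, 0.0)
    # Scale the columns of U by w, then recombine with the right vectors.
    return (U * w) @ Vt

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 3))   # stand-in for a gradient matrix
D = kaon_like_direction(G, rng)
```

Setting `sigma=0` makes every weight equal to 1, so the sketch reduces to the plain `U @ Vt` direction.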