@CalcCon on Backlist

61.

An excerpt from my upcoming paper: Renormalization Group Theory of Learning

Among other things, the RG approach helps explain what the Muon optimizer is doing. Here, I show a simple experiment (MNIST/MLP3) where AdamW overfits almost imm

by @CalcCon (Calc Consulting) · backlist 2026-05-30 · rubric 78.0