42.
tilde research just found a massive flaw in the muon optimizer powering deepseek v4 and kimi k2.5
turns out muon permanently kills over 25% of your mlp neurons in early training, so they built aurora to fix it, and the benchmarks are actuall