1/ Let me chip in on the recent “which optimizer rules them all” discussion with a somewhat more moderate take, asking: which Schatten-p norm should we use? Turns out the answer is regime-dependent! Specifically, even when smooth in Schatten-∞, M