@_selebou on Backlist

1 appearance on the backlist front page in the last 30 days.

56.

(1/6) New μP paper: Under μP, GQA passes coordinate checks but fails to transfer learning rate! That contradiction isn't just a quirk of GQA. It exposes a silent failure mode in the TPV framework that the original theory simply can't dete

by (Kyle Chickering) · backlist 2026-06-22 · rubric 100.0