32.
I find GLM-5.2 currently unusable for hard reasoning tasks. I gave it 11 induction problems from my benchmark (IC… (t.co)
I find GLM-5.2 currently unusable for hard reasoning tasks. I gave it 11 induction problems from my benchmark (ICML 2026, https://arxiv.org/abs/2602.18956).
- 4 out of the 11 completed, the rest failed; 2 correct
- Average time per compl