45.
We've always intuited that verification is easier than generation. Chen's new work shows that explicitly training for it unlocks massive self-improvement: a 14× boost in test-time refinement on hard reasoning and a 30% gain beyond the RL plateau.