41.
New work: The Value Axis
New work: The Value Axis How do LLMs choose which path to take mid-task? We find they internally track the chance of reaching their goal along a linear axis, akin to a value function in RL. We show it modulates confidence in math & coding