I'd like to issue a clarification on the section covered around 1:33:44 of the podcast, where @dwarkesh_sp and I discuss multi-step vs. single-step LLM RL. I muddle through some hand-wavy quadratic variance statements around multi-step RL