73.
I think some interplay between LLM architecture, RL, and instruct data is missing. Right now LLM "processing depth" seems really shallow: reasoning traces look like retrieval plus incremental reasoning, which is not what one