70.
Getting the chat template consistent across multiple turns for agentic training can be much more tricker than peo…
Getting the chat template consistent across multiple turns for agentic training can be much more tricker than people think. There have been headaches like reasoning trajectory pruned by chat templates, detokenize-retokenize mismatch etc.