DistIL starts from a simple weakness in RLVR: most of the signal is still one bit at the end.
DistIL starts from a simple weakness in RLVR: most of the signal is still one bit at the end. Instead, it uses richer feedback: execution traces, tool outputs, expert corrections, ground-truth solutions, or model critiques. It replaces re