@agarwl_ on Backlist

2 appearances on the backlist front page in the last 30 days.

18.

On-policy distillation with positive-pressure tokens

Using only the tokens where the teacher assigns higher probability than the student can still minimize an upper bound on the on-policy distillation loss.

by @agarwl_ (Rishabh Agarwal) · backlist 2026-06-13 · rubric 81.0

74.

The quest for reliable on-policy self-distillation continues. Hope something stands the test of time.

by @agarwl_ (Rishabh Agarwal) · backlist 2026-05-23 · rubric 78.0