@rronak_ on Backlist

3 appearances on the backlist front page in the last 30 days.

39.

SDPO++ for Continual Learning. Day 5 of Trajectory, we modify Self Distillation Policy Optimization for long-horizon agentic tasks. SDPO is a promising route. It learns from a single trajectory, with no group required and failures still p

by (Ronak Malde) · backlist 2026-06-02 · rubric 79.0
81.

(x.com)

Today, @MichaelElabd, @QuantumArjun, and I are excited to announce Trajectory. We are a research lab and product company building the platform for Continual Learning. Our platform unlocks the signal already sitting in product usage,

by (Ronak Malde) · backlist 2026-05-27 · rubric 86.0
55.

I’ve left Google DeepMind. The last two years have been an incredible whirlwind. A couple years ago, I joined a small startup called Codeium. There, I got to ship Windsurf, train SWE-1 (a frontier agentic coding model), go to DeepMind in

by (Ronak Malde) · backlist 2026-05-24 · rubric 88.0