@kimbochen on Backlist

2 appearances on the backlist front page in the last 30 days.

50.

Systems and algorithms have never been more entangled for RL

Why apply importance sampling? Why partial rollout? Why is inference paradoxically the major part of RL training? Here we build the basic intuition for what are the critical conce

by @kimbochen (Kimbo) · backlist 2026-06-16 · rubric 76.0

60.

Design (x.com)

Design: GQA + top-k indexer. Scoring: SDPA + max pooling (Light house attn? @SubhoGhosh02). Training: dense warmup + KL loss to match index branch output to main branch attn output; stop gradient at index weight projection.

by @kimbochen (Kimbo) · backlist 2026-06-11 · rubric 86.0