@kimbochen on Backlist

2 appearances on the backlist front page in the last 30 days.

60.

(x.com)

Design: GQA + top-k indexer
Scoring: SDPA + max pooling (Light house attn? @SubhoGhosh02)
Training: dense warmup + KL loss to match index branch output to main branch attn output; stop gradient at index weight projection
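A rough forward-only sketch of how these pieces could fit together, in plain Python. The post does not spell out the details, so several things here are assumptions: max pooling is taken over the index heads of one GQA group, the KL target is the mean of the main heads' dense attention distributions, and the direction is KL(main || index). The function names (`indexed_group_attention`, `sdpa_scores`) are hypothetical. The dense warmup schedule and the stop gradient are not modeled, since this code has no autodiff.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sdpa_scores(query, keys):
    # Scaled dot-product scores of one query against every key.
    scale = 1.0 / math.sqrt(len(query))
    return [dot(query, key) * scale for key in keys]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def indexed_group_attention(main_queries, index_queries, keys, index_keys, values, k):
    # Index branch: SDPA scores per index head, then max pooling over heads
    # (assumed pooling axis) to get one score per key position.
    per_head = [sdpa_scores(q, index_keys) for q in index_queries]
    pooled = [max(col) for col in zip(*per_head)]
    order = sorted(range(len(pooled)), key=lambda i: pooled[i], reverse=True)
    selected = sorted(order[:k])  # top-k key positions, in sequence order

    # Main branch (GQA): every query head in the group shares the same keys/values.
    main_scores = [sdpa_scores(q, keys) for q in main_queries]
    dense = [softmax(s) for s in main_scores]

    # Assumed KL target: mean dense attention over the group's query heads.
    # In training this target would be treated as a constant.
    target = [sum(col) / len(dense) for col in zip(*dense)]
    loss = kl_divergence(target, softmax(pooled))

    # Sparse attention: each main head attends only to the selected keys.
    outputs = []
    for scores in main_scores:
        weights = softmax([scores[i] for i in selected])
        outputs.append([
            sum(w * values[i][d] for w, i in zip(weights, selected))
            for d in range(len(values[0]))
        ])
    return outputs, selected, loss

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 2.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
main_queries = [[1.0, 0.0], [0.0, 1.0]]    # two query heads, one shared KV head
index_queries = [[1.0, 0.0], [0.0, 3.0]]   # two lightweight index heads

outputs, selected, loss = indexed_group_attention(
    main_queries, index_queries, keys, keys, values, k=2
)
print(selected)  # [1, 2]
```

With `k` equal to the number of keys, every position is selected and the sparse step reduces to ordinary dense attention, which is one way to picture the dense warmup phase.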

by (Kimbo) · backlist 2026-06-11 · rubric 86.0