@a_karvonen on Backlist

3 appearances on the backlist front page in the last 30 days.

84.

(x.com)

Some cool work that I co-mentored with @NeelNanda5. I recommend the appendix section on practical AO evaluation details. In particular, consensus sampling significantly reduces hallucinations, and eval performance majorly improves with

by (Adam Karvonen) · backlist 2026-06-04 · rubric 72.0