@a_karvonen on Backlist

3 appearances on the Backlist front page in the last 30 days.

10.

Training Common Crawl chronologically reduces recency failure

Models trained on web data ordered from 2018 to 2025 performed much better on recent facts than models seeing the same data in a nonsequential order

by @a_karvonen (Adam Karvonen) · backlist 2026-06-06 · rubric 78.0

84.

Some cool work that I co-mentored with (x.com)

Some cool work that I co-mentored with @NeelNanda5. I recommend the appendix section on practical AO evaluation details. In particular, consensus sampling significantly reduces hallucinations, and eval performance majorly improves with

by @a_karvonen (Adam Karvonen) · backlist 2026-06-04 · rubric 72.0

30.

I am truly blown away by Qwen-3.5-27B.

I am truly blown away by Qwen-3.5-27B. It's doing better than Haiku 4.5 on my OOD interp task that involves 50k context in an agentic setting. Such a great cheap model for research tasks.

by @a_karvonen (Adam Karvonen) · backlist 2026-05-30 · rubric 88.0