Backlist — 11 May 2026 UTC

Kept the AI slate to one major model release plus the most substantive infra, benchmark, biology, and labor-market items; left out several near-duplicate coding-agent and OpenAI launch posts to preserve breadth.

14.

(t.co)

River’s resumable jobs let expensive substeps, such as spinning up an agent sandbox, run once and survive retries, reducing wasted work in background processing.

by (Brandur) · backlist 2026-05-11 · rubric 88.0
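
The run-once-and-survive-retries idea in the item above can be sketched generically. This is a minimal illustration, not River's API: `ResumableJob`, `step`, `create_sandbox`, and `flaky_work` are hypothetical names, and the in-memory `checkpoint` dict stands in for whatever state a real queue would persist with the job between attempts.

```python
from typing import Any, Callable, Dict


class ResumableJob:
    """Hypothetical helper: caches each named step's result across retries."""

    def __init__(self, checkpoint: Dict[str, Any]) -> None:
        # Assumption: a real queue would persist this state with the job;
        # here it is a plain dict that outlives individual attempts.
        self.checkpoint = checkpoint

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # A step that already completed is never re-executed.
        if name in self.checkpoint:
            return self.checkpoint[name]
        result = fn()
        self.checkpoint[name] = result
        return result


calls = {"sandbox": 0, "work": 0}


def create_sandbox() -> str:
    # Stand-in for the expensive substep.
    calls["sandbox"] += 1
    return "sandbox-1"


def flaky_work(sandbox_id: str) -> str:
    # Fails on its first call to force a retry of the whole job.
    calls["work"] += 1
    if calls["work"] == 1:
        raise RuntimeError("transient failure")
    return f"done in {sandbox_id}"


def run_attempt(checkpoint: Dict[str, Any]) -> str:
    job = ResumableJob(checkpoint)
    sandbox_id = job.step("create_sandbox", create_sandbox)
    return job.step("work", lambda: flaky_work(sandbox_id))


def run_with_retries(max_attempts: int = 3) -> str:
    checkpoint: Dict[str, Any] = {}
    for _ in range(max_attempts):
        try:
            return run_attempt(checkpoint)
        except RuntimeError:
            continue  # retry; completed steps stay in the checkpoint
    raise RuntimeError("job exhausted its attempts")


result = run_with_retries()
```

In this sketch the second attempt reuses the recorded sandbox ID and only repeats the step that failed.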
38.

3/5 Extreme Computation Efficiency: thanks to the LLaVA-UHD v4 technique, MiniCPM-V 4.6 nearly flattens the "Resolution-Latency" curve. Even with 3136² high-res images, TTFT is just 75.7ms on a 4090 GPU, 2.2x faster than Qwen3.5-0.8B. On

by · backlist 2026-05-11 · rubric 93.0
43.

(x.com)

1/?) As promised to Sander Dieleman (@sedielem), we’re finally excited to share: "Towards Closing the Autoregressive Gap in Language Modeling via Entropy-Gated Continuous Bitstream Diffusion." We show that continuous diffusion can achieve

by (Luca Ambrogioni) · backlist 2026-05-11 · rubric 92.0
47.

1/5 MiniCPM-V 4.6 (1.3B) is now live. High-res visual processing, optimized for consumer-grade and mobile hardware. We’ve leveraged the latest LLaVA-UHD v4 technique to cut vision encoding costs by 55%, enabling native edge deployment with

by · backlist 2026-05-11 · rubric 91.0
50.

(x.com)

2/ @jess__yan + i showed a toy example of Outcomes: i had a Managed Agent make a generative UI w/ metrics (charts, graphs) rendered as svg. i used an Outcomes loop to improve the render timing - Claude figured out various tricks (prompti

by (Lance Martin) · backlist 2026-05-11 · rubric 90.0
54.

(x.com)

We are excited to welcome @suragnair this Tuesday to present CompBioBench: A benchmark of 100 diverse tasks for evaluating agentic systems in computational biology! 2:30pm Tuesday May 12 | CoDa E160 | Stanford and Zoom

by (Stanford AI+Biomedicine Seminar) · backlist 2026-05-11 · rubric 90.0
68.

(t.co)

New at http://makeitanimated.dev: the keyboard opens, three cards fan out, the label floats up, all on the UI thread. Slash app: login input on-focus animation. React Native + Expo

by (make it animated) · backlist 2026-05-11 · rubric 88.0
72.

some motivating ideas: RL is structured around task completion, which maps directly onto how occupational classifications are built. Prior approaches were not. The gap between those two is large for specific occupation groups to be meanin

by (Philip Moreira Tomei) · backlist 2026-05-11 · rubric 87.0
77.

(x.com)

new research from me @METR_Evals: technical workers claim that today's AI impacts the value of their work to an extraordinary degree (& growing over time). of course, self-reports plausibly overestimate. the magnitudes nonetheless strike me

by (Joel Becker) · backlist 2026-05-11 · rubric 86.0