Backlist — 24 May 2026 UTC

Chose one representative per repeated story where possible, especially around AI math results, agent tooling, and semiconductor rumors.

28.

Next up, Anthropic on SWE-Bench Pro. This is where we see bigger jumps rather than incremental gains: Opus 4.6 scored 53.4%, Opus 4.7 hit 64.3%, and Mythos Preview jumped to 77.8%.

by (Chris) · backlist 2026-05-24 · rubric 97.0
45.

New in emulate v0.6: way more Slack. Emulate Slack in CI + agent sandboxes. CLI or Next.js adapter. Messages Threads Channels DMs MPIMs OAuth Inspector SDK tests Scopes Profiles Presence Files Uploads Pins Bookmarks Apps Modals Webhooks Eve

by (Chris Tate) · backlist 2026-05-24 · rubric 91.0
47.

Think ⨉ Skills --- https://github.com/cloudflare/agents/pull/1584 - supports the https://agentskills.io spec - load via local filesystem/codebase, or r2 (git coming soon? maybe) - configurable permission model - working js/python/ba

by (sunil pai) · backlist 2026-05-24 · rubric 91.0
48.

@saturdayrobotic & World Model Reading Club 09, Part 1 Recap, @CVPR Warm-up: keynote @tommiekerssies, hosts @junfanzhu98, @aurorafeng_01, @zoeytzh. A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tok

by (Junfan Zhu 朱俊帆 CVPR) · backlist 2026-05-24 · rubric 91.0
52.

Fast browser agents are getting cheap. Composer 2.5 in Pi drove Sauce Demo checkout in a headed browser and placed the demo order: 39s end-to-end, 28 browser tool calls, ~7.7k Pi-estimated model tokens, pi-agent-browser-native v0.2.33. What

by (Mitch Fultz) · backlist 2026-05-24 · rubric 90.0
55.

I’ve left Google DeepMind. The last two years have been an incredible whirlwind. A couple years ago, I joined a small startup called Codeium. There, I got to ship Windsurf, train SWE-1 (a frontier agentic coding model), go to DeepMind in

by (Ronak Malde) · backlist 2026-05-24 · rubric 88.0
57.

Paper here: https://arxiv.org/pdf/2502.12170. The MUDD coefficients are used for many purposes, such as routing multiple layers into future attention values, modulating the value embedding, modulating the bigram embedding, etc. (Delay on

by (Larry Dial) · backlist 2026-05-24 · rubric 88.0
64.

new aithy website https://github.com/dosco/aithy - better secure sandbox - artifact management - tons of built in skills - new multi-service arch. - packaged deployer - mesh networking - usage tracking - permissions system - memories -

by (spacy) · backlist 2026-05-24 · rubric 87.0
85.

introducing >molly, a terminal-native discord client. discord is slow, heavy, and resource hungry - a browser wrapped in electron, burning your ram just to chat. molly fixes that. open your terminal, type molly, and you're in discord. no br

by (puang) · backlist 2026-05-24 · rubric 84.0