Backlist — 16 May 2026 UTC

Top 90 curated tweets ranked for substance on 16 May 2026 UTC.

32.

the "Mythos 69%" headline is misleading. GPT-5.5 Codex sits at 41%, looks close. on T1 (full control, the only tier that matters): Mythos: 16 envs; GPT-5.5 Codex: 2; Opus 4.7: 0. 9x gap, not 1.7x. baseline Mythos (no hints) hits MORE T1s th

by (Chetaslua) · backlist 2026-05-16 · rubric 92.0
36.

LMFAO. Opus 4.7 just wrote me a fix for a bug, wrote tests, wired it all, it all passed. Great. then it goes - same session; "now let me revert all my changes as a sanity check and verify that the tests no longer pass" BRO WHAT.

by (keith) · backlist 2026-05-16 · rubric 88.0
39.

What a beautiful sight @CloudflareDev finally solved the $25/day durable object cost issue: a 3-second setInterval logic in YJS awareness that was stopping the durable object from hibernating. cons of building your own real-time multip

by (Anurag Bhagsain) · backlist 2026-05-16 · rubric 86.0
44.

Big inference day with @tuhinone from @baseten at MS&E 435 yesterday (and the Cerebras IPO that AM) We discussed compute scarcity amid the rising inference demand, model routing, & open source! Video soon.

by (Apoorv Agrawal) · backlist 2026-05-16 · rubric 85.0
54.

Fun visit to the @DeltaInstitutes podcast where we discussed all things agent improvement ranging from data to evals to post-training! Sleeper show. Honored to be featured amongst some technical beasts

by (Alex Shan) · backlist 2026-05-16 · rubric 82.0
63.

Multimodal AMIE now published @NatureMedicine ! This is from past work @GoogleDeepMind where we studied patients uploading images during diagnostic dialogue. We found that a multimodal reasoning harness that tracks a patient’s state gre

by (Khaled Saab) · backlist 2026-05-16 · rubric 78.0
70.

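A paper that systematizes how to apply reinforcement learning (RL) to multiple AI agents is out (https://arxiv[.]org/html/2605.02801v1). Applying single-AI RL as-is to multiple AIs fails. The reason is interesting. The biggest cause is "credit diffusion" (credit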

by (Itaru Tomita / 冨田到) · backlist 2026-05-16 · rubric 78.0
73.

.@bulktrade solves this with structural priority queues, sequenced within native Solana validators by order type and FIFO: 1. Cancels 2. Maker 3. Market, limit, IOC. 7ms+ time gap between maker and taker.

by (Cory L) · backlist 2026-05-16 · rubric 78.0
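
The ordering described in this tweet (cancels first, then maker orders, then market/limit/IOC, with FIFO inside each class) can be sketched as a small priority queue. This is only an illustration under stated assumptions, not bulktrade's implementation: the class name `OrderQueue`, the `PRIORITY` table, and the `drain` helper are hypothetical, and the 7ms+ maker/taker time gap is not modeled.

```python
import heapq
import itertools

# Hypothetical priority classes read off the tweet's ordering; lower runs first.
PRIORITY = {"cancel": 0, "maker": 1, "market": 2, "limit": 2, "ioc": 2}


class OrderQueue:
    """Orders sorted by priority class, then by arrival order (FIFO)."""

    def __init__(self):
        self._heap = []
        # Monotonic arrival counter: breaks ties inside a class in FIFO order.
        self._seq = itertools.count()

    def push(self, order_type, order_id):
        key = (PRIORITY[order_type], next(self._seq))
        heapq.heappush(self._heap, (key, order_id))

    def pop(self):
        _, order_id = heapq.heappop(self._heap)
        return order_id

    def __len__(self):
        return len(self._heap)


def drain(arrivals):
    """Queue (order_type, order_id) pairs and return ids in execution order."""
    queue = OrderQueue()
    for order_type, order_id in arrivals:
        queue.push(order_type, order_id)
    return [queue.pop() for _ in range(len(queue))]


arrivals = [
    ("market", "m1"),
    ("maker", "k1"),
    ("cancel", "c1"),
    ("limit", "l1"),
    ("cancel", "c2"),
]
print(drain(arrivals))
```

Because the arrival counter is part of the sort key, two orders in the same class never reorder relative to each other, which is the FIFO property the tweet mentions.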
76.

I'm joining @GraySwanAI's R&D team this summer to work on AI security for frontier models. Excited to learn from the best in LLM attacks and defenses!

by (Yixiong Hao) · backlist 2026-05-16 · rubric 78.0