Backlist — 16 Jun 2026 UTC

1.

France’s domestic intelligence agency is replacing Palantir with ChapsVision

France ended a repeatedly renewed DGSI contract with Palantir after ChapsVision became viable enough as a domestic alternative, following a similar German move

by @PernotLeplay (Emmanuel Pernot-Leplay) · backlist 2026-06-16 · rubric 83.0

2.

GLM-5.2: open weights, 1M context, MIT license, day-0 in Transformers/vLLM/SGLang

GLM-5.2 put an open-weight model near closed frontier systems while shipping practical deployment support and a permissive license

by @mervenoyann (merve) · backlist 2026-06-16 · rubric 78.0

3.

OpenAI reportedly spent $34B in 2025

Audited figures reportedly put OpenAI’s 2025 spending at $34B, including $19B on R&D and nearly $6B on sales and marketing

by @Techmeme · backlist 2026-06-16 · rubric 66.0

4.

DeepSeek reportedly raised $7.4B at a $50B valuation

DeepSeek’s first external round reportedly required investors to put capital into an LP run by CEO Liang Wenfeng, an unusual structure for a frontier AI company

by @Techmeme · backlist 2026-06-16 · rubric 58.0

5.

SpaceX’s low-float stock as acquisition currency

Using newly public, retail-inflated, low-float equity to buy Cursor would let SpaceX acquire a real business while effectively selling into IPO demand before lockups expire

by @qthomp (Quinn Thompson) · backlist 2026-06-16 · rubric 58.0

6.

Human Universal Grasping: dexterous robot hands trained with zero robot data (t.co)

HUG learns multi-finger robot grasping entirely from human hand data collected with smart glasses, then retargets it to humanoid hands

by @kevin_y_wu (Kevin Y Wu) · backlist 2026-06-16 · rubric 58.0

7.

Getting monkey visual neurons to describe their preferred images in human language (x.com)

The work maps higher visual neuron responses to natural-language descriptions in an automated, verifiable pipeline at scale

by @SuryaGanguli (Surya Ganguli) · backlist 2026-06-16 · rubric 64.0

8.

Training nonlinear RNNs with flat, parallelized O(1) gradients

The proposed method attacks BPTT’s sequential, unstable O(T) gradient path and reframes how expressive RNNs can be trained

by @che_shr_cat (Grigory Sapunov) · backlist 2026-06-16 · rubric 82.0

9.

BIS’s legal theory for ordering Anthropic to shut down Fable access

Commerce framed Fable access as requiring individually validated export licenses under ECRA, turning frontier model use into an export-control licensing question

by @CharlieBull0ck (Charlie Bullock) · backlist 2026-06-16 · rubric 72.0

10.

Origin: a Git competitor designed for agent workloads

Origin is positioned as version control built for scalable agent workflows, API/MCP extensibility, and automated merge-conflict resolution

by @samlambert (Sam Lambert) · backlist 2026-06-16 · rubric 26.0

11.

Sentence Transformers v5.6 fixes a silent causal-LM reranker scoring bug

The release fixes a silent scoring bug affecting causal-LM rerankers such as Qwen3-Reranker, plus hard-negative mining and loss-correctness issues

by @tomaarsen · backlist 2026-06-16 · rubric 84.0

12.

RF-DETR keypoints: real-time transformer keypoint detection under Apache 2.0

RF-DETR keypoints reports 71.8 AP on COCO at 9.7ms on a T4, beating YOLO pose models at similar latency

by @skalskip92 (SkalskiP) · backlist 2026-06-16 · rubric 84.0

13.

Running TypeScript-Go as a WASM module inside a worker

A forked TypeScript-Go can compile and typecheck inline TypeScript at the edge inside a dynamic worker

by @mmkalmmkal (Misha Kaletsky) · backlist 2026-06-16 · rubric 76.0

14.

uv audit: native vulnerability scanning for project dependencies

uv added built-in dependency vulnerability scanning, bringing audit workflows directly into the project manager

by @charliermarsh (Charlie Marsh) · backlist 2026-06-16 · rubric 38.0

15.

PropAMMs depend on external price discovery

PropAMM efficiency comes from importing prices via oracle, while traditional AMMs try to be the venue where price discovery happens

by @MaxResnick (Max Resnick) · backlist 2026-06-16 · rubric 74.0
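
A minimal Python sketch of the contrast, under loud assumptions: the "traditional AMM" is modeled as a fee-free constant-product pool (x * y = k), and the oracle-priced venue is modeled as quoting a fixed spread around an external price. The function names, the 5 bps spread, and the pool sizes are illustrative and come from neither the post nor any specific protocol.

```python
def constant_product_buy(reserve_base: float, reserve_quote: float, base_out: float) -> float:
    """Quote-token cost of buying `base_out` from an x*y=k pool (fees ignored)."""
    if not 0 < base_out < reserve_base:
        raise ValueError("base_out must be positive and smaller than the pool reserve")
    k = reserve_base * reserve_quote
    # The pool must hold k / (remaining base) quote tokens after the trade.
    new_reserve_quote = k / (reserve_base - base_out)
    return new_reserve_quote - reserve_quote


def oracle_quote_buy(oracle_price: float, base_out: float, spread_bps: float = 5.0) -> float:
    """Cost when the venue imports the price and charges a fixed spread (assumed model)."""
    return base_out * oracle_price * (1 + spread_bps / 10_000)


# Hypothetical pool: 1,000 base vs 100,000 quote, so the implied price is 100.
amm_cost = constant_product_buy(1_000.0, 100_000.0, 10.0)
oracle_cost = oracle_quote_buy(100.0, 10.0)

print(f"constant-product cost: {amm_cost:.2f}")
print(f"oracle-quoted cost:    {oracle_cost:.2f}")
```

In the pool, the trade itself moves the price, which is how price discovery happens on the venue. In the oracle model, the cost depends only on the imported price and the spread, so the venue is only as good as its price feed.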

16.

OpenCAL: open-source layerless 3D printing in minutes

OpenCAL enables batch 3D printing without layers, moving toward volumetric prints that can finish in minutes

by @FabSceneJp (FabScene（ファブシーン）) · backlist 2026-06-16 · rubric 34.0

17.

Fable doubled Opus on ProgramBench while falling back to Opus for almost every run

The result shows how routing, fallback behavior, token budgets, and harness design can dominate what looks like a model capability gain

by @ValsAI (Vals AI) · backlist 2026-06-16 · rubric 72.0

18.

Why oil prices did not go to $200 (t.co)

Global oil markets absorbed shocks partly because importers such as South Korea rapidly substituted Canadian barrels for collapsed Saudi imports

by @robin_j_brooks (Robin Brooks) · backlist 2026-06-16 · rubric 78.0

19.

Google releases vectorized dataset for fine-scale ecological mapping

The dataset maps hedgerows and other small ecological features that standard satellite products often miss, supporting biodiversity and climate planning

by @GoogleResearch (Google Research) · backlist 2026-06-16 · rubric 38.0

20.

A researcher reportedly gained access to FIFA broadcast panels

A FIFA system vulnerability allegedly exposed broadcast controls powerful enough to stop live feeds, and the researcher struggled to reach anyone to fix it

by @utkusen (Utku Şen) · backlist 2026-06-16 · rubric 28.0

21.

EU says video game providers must disclose and honor game lifespans

EU consumer law may require game providers to disclose service duration and reimburse players if supply stops earlier than promised

by @EU_Commission (European Commission) · backlist 2026-06-16 · rubric 28.0

22.

The 1947 engineering project behind the British plug

The British plug’s safety comes from deliberate postwar design choices that make common child-electrocution failure modes much harder

by @anishmoonka (Anish Moonka) · backlist 2026-06-16 · rubric 52.0

23.

Aivres, Inspur, and sanctions-by-renaming

Aivres is wholly owned by IEIT Systems, which is roughly one-third owned by Inspur Group, while still advertising Blackwell and Rubin systems

by @SemiAnalysis_ (SemiAnalysis) · backlist 2026-06-16 · rubric 77.0

24.

Kalshi at $10M in daily fees (x.com)

Kalshi’s reported $10M in one-day fees exceeded Polymarket International, PumpFun, Hyperliquid, and Circle on the same DefiLlama snapshot

by @datadashboards (dash) · backlist 2026-06-16 · rubric 72.0

25.

Microsoft is reportedly adding AWS capacity to GitHub after AI-driven infra strain (x.com)

GitHub’s AI-driven growth reportedly strained Microsoft infrastructure enough that the company is adding AWS capacity to stabilize reliability

by @Techmeme · backlist 2026-06-16 · rubric 62.0

26.

Formicat-punk, or Giscard-punk

The underused French late-1970s retrofuturist style offers a distinct visual world for films, games, and speculative design

by @sylvainsarrailh (Tohad) · backlist 2026-06-16 · rubric 83.0

27.

Luxury housing prevents luxury buyers from bidding downmarket

When cities block new high-end housing, wealthy buyers compete for and convert existing lower-tier homes, raising prices for everyone else

by @agupta (Ankit Gupta) · backlist 2026-06-16 · rubric 74.0

28.

NSF appoints Adam Russell to lead $1.5B X-Labs initiative

The new NSF X-Labs initiative will fund ambitious research institutions, led by a former DARPA/IARPA/ARPA-H operator

by @AlecStapp (Alec Stapp) · backlist 2026-06-16 · rubric 42.0

29.

Optimal Transport for Machine Learners (t.co)

Gabriel Peyré released an alpha version of an optimal transport book for ML practitioners, including an online edition with interactive figures

by @gabrielpeyre (Gabriel Peyré) · backlist 2026-06-16 · rubric 22.0

30.

The fast takeoff narrative basically kills this IMO. In a world in which labs are releasing step change improveme…

The fast takeoff narrative basically kills this IMO. In a world in which labs are releasing step change improvements every month, why would an enterprise want to be running on a 9 month behind Chinese post-train? Just use a good harness and

by @ryanbrewer (Ryan Brewer) · backlist 2026-06-16 · rubric 90.0

31.

#Clay Works #Creation #Original Creation

#Clay Works #Creation #Original Creation #clayart 『Bored Girl』 Made with clay.

by @CnoO5c (がら) · backlist 2026-06-16 · rubric 86.0

32.

a monad is a monoid in the category of endofunctors

by @schniz (Schlez) · backlist 2026-06-16 · rubric 86.0

33.

SWE-Marathon exposes whether agents actually solve the task, or start searching for exploits in the verifier/envi…

SWE-Marathon exposes whether agents actually solve the task, or start searching for exploits in the verifier/environment. Across 100 GLM 5.2 rollouts, we saw only 3% shortcut-seeking behavior and no shipped exploit code.

by @rishi_desai2 (Rishi Desai) · backlist 2026-06-16 · rubric 85.0

34.

we are weeks away from a startup called Mid raising $100M

by @paularambles (“paula”) · backlist 2026-06-16 · rubric 84.0

35.

today i (accidentally) learned that claude code not only tests ios apps in a hidden xcode simulator, but also mak…

today i (accidentally) learned that claude code not only tests ios apps in a hidden xcode simulator, but also makes screen recordings and creates a shot-by-shot breakdown when testing animations. holy shit. (press cmd+g in finder and go to

by @sekachov (alexey) · backlist 2026-06-16 · rubric 84.0

36.

Please enlarge this.

Please enlarge this. This is "Water Mill" (1892) by Norwegian Fritz Thaulow (1847–1906).

by @Estetism_jp (耽美なる絵画とモノ) · backlist 2026-06-16 · rubric 84.0

37.

So you were asking whether gains from coding would generalize to other domains?

So you were asking whether gains from coding would generalize to other domains? We found GLM-5.2 to be no better than GLM-5.1 on FutureSim. The gap between open and closed-weights here is massive! Also, despite Fable-5 being contaminate

by @nikhilchandak29 (Nikhil Chandak) · backlist 2026-06-16 · rubric 83.0

38.

Kestrel family reunion

Kestrel family reunion Great to see Apollo & Athena's fledglings all together like this

by @RobertEFuller (Robert E Fuller) · backlist 2026-06-16 · rubric 83.0

39.

Do u know that Triton can specialize on pointer alignment (16B or not) and non-constexpr int (1, multiple of 2/4/…

Do u know that Triton can specialize on pointer alignment (16B or not) and non-constexpr int (1, multiple of 2/4/8/16.., or not). There are good reasons for it, but can be unexpected if unaware.

by @gaunernst (Thien Tran) · backlist 2026-06-16 · rubric 83.0

40.

the first ingredient required to start a RL environment company is to have close MTS friends at Anthropic/OAI/GDM

the first ingredient required to start a RL environment company is to have close MTS friends at Anthropic/OAI/GDM else DOA

by @pHequals7 (pH) · backlist 2026-06-16 · rubric 82.0

41.

GLM 5.2 also solved ruby-rust-port, a task no other agent including Claude Fable 5 has solved before.

GLM 5.2 also solved ruby-rust-port, a task no other agent including Claude Fable 5 has solved before. It also sustained a 350M+ token rollout on nextjs-vite-rewrite.

by @rishi_desai2 (Rishi Desai) · backlist 2026-06-16 · rubric 82.0

42.

4 of my batchmates from Engineering Physics at IIT Bombay are at Anthropic/OpenAI

4 of my batchmates from Engineering Physics at IIT Bombay are at Anthropic/OpenAI From a department of 36 people that is ~12%, and I suspect it has a higher AI lab density than CSE which had ~120 people Deep physics research with PhDs pay

by @aviralbhat (Aviral Bhatnagar) · backlist 2026-06-16 · rubric 82.0

43.

I'm fairly convinced there's some universal language manifold (= a surface formed by meaning vectors) that both h…

I'm fairly convinced there's some universal language manifold (= a surface formed by meaning vectors) that both humans and LLMs operate on. But we don't train LLMs to explicitly represent this manifold. We rather train them to approximate i

by @aimalysheva (Sasha Malysheva) · backlist 2026-06-16 · rubric 79.0

44.

Can we know how safe a model will be before users interact with it? Evals are often narrow and easy for models to…

Can we know how safe a model will be before users interact with it? Evals are often narrow and easy for models to recognize as evals. Solution: testing on prod, before prod. We simulate deploying a model by feeding it millions of prod use

by @tomekkorbak (Tomek Korbak) · backlist 2026-06-16 · rubric 78.0

45.

3,100,000 wallets on polymarket

3,100,000 wallets on polymarket only 52,000 have ever provided liquidity that's 1.7% the reward pool pays out millions every month split between 1.7% of users while the other 98.3% are busy trying to predict outcomes and losing the pl

by @0xbobaaa (0xbobaa) · backlist 2026-06-16 · rubric 78.0

46.

You'll never guess who our #1 competitor is, because it isn't one of the sandbox providers.

You'll never guess who our #1 competitor is, because it isn't one of the sandbox providers. It's Kubernetes. People see that their agent needs to run code and spin up a K8s pod. Then they start feeling the need to add features that K8s

by @ivanburazin (Ivan Burazin) · backlist 2026-06-16 · rubric 78.0

47.

As a chinese-born the manus drama is the most horrible thing to watch, they literally just held the founders and …

As a chinese-born the manus drama is the most horrible thing to watch, they literally just held the founders and their family as hostage to force reverting the sell.

by @zcbenz (Cheng) · backlist 2026-06-16 · rubric 78.0

48.

important to realize that SPCX is not "spacex the company" but the only liquidly tradeable 5% of the company, so …

important to realize that SPCX is not "spacex the company" but the only liquidly tradeable 5% of the company, so it's probably healthier to mentally imagine this as a 100 billion dollar magic box separate from the company as a whole. eth ma

by @DeepDishEnjoyer (peepeepoopoo) · backlist 2026-06-16 · rubric 77.0

49.

Sources: Texas Tech transfer quarterback Brendan Sorsby plans to enter the NFL Supplemental Draft. Amid the legal…

Sources: Texas Tech transfer quarterback Brendan Sorsby plans to enter the NFL Supplemental Draft. Amid the legal wrangling over his NCAA eligibility after admitting he bet on sports, he intends to head to the NFL.

by @PeteThamel (Pete Thamel) · backlist 2026-06-16 · rubric 77.0

50.

Systems and algorithms have never been more entangled for RL

Systems and algorithms have never been more entangled for RL Why apply importance sampling? Why partial rollout? Why is inference paradoxically the major part of RL training? Here we build the basic intuition for what are the critical conce

by @kimbochen (Kimbo) · backlist 2026-06-16 · rubric 76.0

51.

Next-token prediction is myopic. What if transformers learn to predict their own next latent state?

Next-token prediction is myopic. What if transformers learn to predict their own next latent state? We present Next-Latent Prediction (NextLat): a self-supervised learning method that teaches transformers to for

by @jayden_teoh_ (Jayden Teoh) · backlist 2026-06-16 · rubric 76.0

52.

fun fact cuBLAS and cuDNN specialize on alignment too and for user-managed heuristics caches like for cuDNN it’s …

fun fact cuBLAS and cuDNN specialize on alignment too and for user-managed heuristics caches like for cuDNN it’s a cache miss if you have bad alignment the only thing preventing this is default allocator alignment and the implicit agreemen

by @tenderizzation (tender) · backlist 2026-06-16 · rubric 76.0

53.

Any task whether you call it an eval or an environment decomposes into three parts: a dataset of task instances, …

Any task whether you call it an eval or an environment decomposes into three parts: a dataset of task instances, a harness/rollout that lets the model act (multi-turn, with tools and state), and a verifier/reward function that scores the t

by @pandya_marut (Marut Pandya) · backlist 2026-06-16 · rubric 76.0
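
The three-part decomposition can be sketched in Python as follows. This is a simplified illustration, not the author's code: the harness here is single-turn with no tools or state, and every name (`TaskInstance`, `single_turn_harness`, `exact_match_verifier`, `evaluate`, the toy policy) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class TaskInstance:
    prompt: str
    expected: str


Policy = Callable[[str], str]
Harness = Callable[[Policy, TaskInstance], str]
Verifier = Callable[[TaskInstance, str], float]


def single_turn_harness(policy: Policy, instance: TaskInstance) -> str:
    """Harness/rollout: lets the model act. Single turn here; a real one may loop over tools and state."""
    return policy(instance.prompt)


def exact_match_verifier(instance: TaskInstance, output: str) -> float:
    """Verifier/reward: scores one rollout as 1.0 on exact match, else 0.0."""
    return 1.0 if output.strip() == instance.expected else 0.0


def evaluate(dataset: Sequence[TaskInstance], harness: Harness,
             verifier: Verifier, policy: Policy) -> float:
    """Mean reward over the dataset; the same loop serves as an eval or an RL environment."""
    scores = [verifier(inst, harness(policy, inst)) for inst in dataset]
    return sum(scores) / len(scores) if scores else 0.0


# Toy dataset and a toy "model" that gets one of two answers wrong.
toy_dataset = [TaskInstance("2+2", "4"), TaskInstance("3*3", "9")]
toy_answers = {"2+2": "4", "3*3": "6"}


def toy_policy(prompt: str) -> str:
    return toy_answers.get(prompt, "")


mean_score = evaluate(toy_dataset, single_turn_harness, exact_match_verifier, toy_policy)
print(mean_score)
```

Because the three parts are passed in separately, any one of them can be swapped without touching the other two.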

54.

if you're asking claude code to file a PR in our codebase, we make it ask you questions about the code, and if u …

if you're asking claude code to file a PR in our codebase, we make it ask you questions about the code, and if u fail it doesn't file it strong PR-slop prevention mechanism

by @androo_sy (andrew sy) · backlist 2026-06-16 · rubric 76.0

55.

We're at maybe 20% of what a full "computer for agents" looks like.

We're at maybe 20% of what a full "computer for agents" looks like. What agents need that humans already have: - a real composable computer (not one-size-fits-all) - version control built for inner-loop speed - file systems that persist a

by @ivanburazin (Ivan Burazin) · backlist 2026-06-16 · rubric 74.0

56.

The elegant cannon relief on the facade of the Tophane-i Amire building in Beyoğlu, constructed in 1745 for the c…

The elegant cannon relief on the facade of the Tophane-i Amire building in Beyoğlu, constructed in 1745 for the casting of cannons in the Ottoman Empire.

by @mimarkaanc (Kaan Çorbacı) · backlist 2026-06-16 · rubric 74.0

57.

i minted my first nft collection as a full-time artist.

i minted my first nft collection as a full-time artist. the work is built from a hopfield network, one of the most fundamental forms of neural network. it learns different writing systems, then begins to forget them. as its memories decay,

by @poetengineer__ (Kat ⊷ the Poet Engineer) · backlist 2026-06-16 · rubric 74.0

58.

GPT-5.6 is an iterative improvement on 5.5. Better model & cheap, but not fable-class.

GPT-5.6 is an iterative improvement on 5.5. Better model & cheap, but not fable-class. Doesn’t matter though. The training run they’re actually cooking on is setting them up for the Auto Research Assistant in September. Which I expect to

by @ChrissGPT (Chris) · backlist 2026-06-16 · rubric 74.0

59.

fwiw there's pretty heavy overlap in the 90% confidence interval for all of these models and the differences betw…

fwiw there's pretty heavy overlap in the 90% confidence interval for all of these models and the differences between any of the models would not be considered statistically significant. I've still seen enough other benchmarks to judge the

by @RobertHaisfield (Rob Haisfield) · backlist 2026-06-16 · rubric 74.0

60.

Evil Republicans are not coming after your Social Security (please retire this lazy, stale scare tactic).

Evil Republicans are not coming after your Social Security (please retire this lazy, stale scare tactic). *Math* is coming after your Social Security - it's scheduled for a 22% cut in 6 years when the trust fund hits zero. So where is your

by @GarettJones (Garett Jones) · backlist 2026-06-16 · rubric 74.0

61.

At some point people should really ask themselves "can you really rent a hotel in Manhattan for cheaper than an a…

At some point people should really ask themselves "can you really rent a hotel in Manhattan for cheaper than an apartment?" and then "Is my data source for apartment prices accurate?".

by @besttrousers (Matt Darling) · backlist 2026-06-16 · rubric 74.0

62.

i just risked $1,263 to make $437 fading “strait of hormuz traffic returns to normal by end of june.”

i just risked $1,263 to make $437 fading “strait of hormuz traffic returns to normal by end of june.” my read: people are pricing the headline, not the rules. yes does not resolve just because the strait “reopens.” yes requires portwatch

by @0xashensoul (AshenSoul) · backlist 2026-06-16 · rubric 74.0
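
The risk and reward figures in the post imply a break-even win probability, which a short Python sketch can make explicit. It assumes a binary market that pays the full stake plus profit on a win and loses the stake otherwise, and it ignores fees; the function names are illustrative.

```python
def break_even_probability(risk: float, profit: float) -> float:
    """Win probability at which a bet risking `risk` to make `profit` has zero expected value."""
    return risk / (risk + profit)


def expected_value(p_win: float, risk: float, profit: float) -> float:
    """Expected profit of the bet for a given win probability."""
    return p_win * profit - (1 - p_win) * risk


RISK, PROFIT = 1263.0, 437.0  # figures quoted in the post

p_star = break_even_probability(RISK, PROFIT)
print(f"break-even win probability: {p_star:.4f}")
print(f"EV if the true win probability is 80%: {expected_value(0.80, RISK, PROFIT):.2f}")
```

Under these assumptions the position is only profitable in expectation if the trader's side wins more often than `p_star`, roughly three times out of four.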

63.

1/ Let me chip in on the recent “which optimizer rules them all” discussion with a somewhat more moderate take, a…

1/ Let me chip in on the recent “which optimizer rules them all” discussion with a somewhat more moderate take, asking: What Schatten-p norm to use? Turns out the answer is regime dependent! Specifically, even when smooth in Schatten-∞, M

by @tmpethick (Thomas Pethick) · backlist 2026-06-16 · rubric 74.0

64.

Marcus Rashford deal at Man Utd includes £40m clause for all clubs bar #MCFC & #LFC. €30m buy option in Barcelona… (x.com)

Marcus Rashford deal at Man Utd includes £40m clause for all clubs bar #MCFC & #LFC. €30m buy option in Barcelona loan expired yesterday. If returns to #MUFC 28yo’s preference is to honour contract (2028) rather than join another PL side

by @David_Ornstein (David Ornstein) · backlist 2026-06-16 · rubric 74.0

65.

PSA: Do NOT trust any posts or DMs from @TheMatthewAo (x.com)

PSA: Do NOT trust any posts or DMs from @TheMatthewAo - it is my old, compromised account. That X account has been hacked and is actively phishing people over DMs. I am currently reporting that account for impersonation. @aidenybai

by @ItsMatthewAo (Matthew Ao) · backlist 2026-06-16 · rubric 74.0

66.

Doing a lot of pitch practice lately. I saw a founder this week who'd engineered his pitch so investors couldn't …

Doing a lot of pitch practice lately. I saw a founder this week who'd engineered his pitch so investors couldn't interrupt with hard questions. He thought a clean, unbroken pitch was the win. It's the opposite. A great pitch invites the ha

by @dessaigne (Nicolas Dessaigne) · backlist 2026-06-16 · rubric 74.0

67.

Time per Intelligence Index task for leading models ranges from 1.5 minutes for Grok 4.3 (high) to 13.5 minutes f…

Time per Intelligence Index task for leading models ranges from 1.5 minutes for Grok 4.3 (high) to 13.5 minutes for Claude Sonnet 4.6 (max). Claude Sonnet 4.6 takes longer per task than Claude Opus 4.8 (max) because it uses more output toke

by @ArtificialAnlys (Artificial Analysis) · backlist 2026-06-16 · rubric 74.0

68.

According to the US Govt, Inspur Group Co. Ltd. is a military-civil fusion contributor affiliated with MIIT and S…

According to the US Govt, Inspur Group Co. Ltd. is a military-civil fusion contributor affiliated with MIIT and SASAC. Aivres Systems Inc. is not a separate company in any meaningful sense, because it was literally called Inspur Systems, In

by @SemiAnalysis_ (SemiAnalysis) · backlist 2026-06-16 · rubric 74.0

69.

UC Riverside has managed to "significantly improve student outcomes" not by helping students perform better on th…

UC Riverside has managed to "significantly improve student outcomes" not by helping students perform better on their finals, but instead by making the finals count for less of the grade:

by @garryslist (Garry's List) · backlist 2026-06-16 · rubric 74.0

70.

World models are surprisingly fragile! (t.co)

World models are surprisingly fragile! We introduce BadWorld, an adversarial attack for visual world models. A tiny perturbation to the starting image can break down the whole world. Code: https://github.com/LinghuiiShen/BadWorld … Pap

by @yxy2168 (Xingyi Yang) · backlist 2026-06-16 · rubric 73.0

71.

We’re releasing our Code Migration benchmark — and we managed to get Fable tested in time

We’re releasing our Code Migration benchmark — and we managed to get Fable tested in time Code migration carries real economic weight. COBOL powers banks, payrolls, government services, and underpins nearly 95% of US ATM transactions. The

by @ValsAI (Vals AI) · backlist 2026-06-16 · rubric 72.0

72.

We registered the AI agent itself with the SEC as an investment advisor.

We registered the AI agent itself with the SEC as an investment advisor. It has your complete context on your portfolio and account history. Speak to it in plain English to take action on your account. It will even prompt you with ideas yo

by @brian_armstrong (Brian Armstrong) · backlist 2026-06-16 · rubric 72.0

73.

frontier labs are absolutely scamming you on API pricing btw

frontier labs are absolutely scamming you on API pricing btw GLM-5.2 is $4.4 output at 744B@40B DeepSeek-V4-Pro is $0.87 output at 1.6T@49B (and they are both making money, without any fancy Blackwell chips) Sonnet 4.6 is $15 output Opus

by @scaling01 (Lisan al Gaib) · backlist 2026-06-16 · rubric 72.0

74.

very cool to see this used in the actual model; indexer is the main bottleneck for inference speed in DSA and it …

very cool to see this used in the actual model; indexer is the main bottleneck for inference speed in DSA and it seems completely unnecessary to have it per layer

by @tugot17 (Piotr Mazurek) · backlist 2026-06-16 · rubric 72.0

75.

The $SPCX unlocks will be nothing-burgers.

The $SPCX unlocks will be nothing-burgers. Every single employee, as we speak, is being offered the ability to collateralize their shares into a credit line that allows them to avoid selling their ownership in a company they believe i

by @AndySwan (Swan) · backlist 2026-06-16 · rubric 72.0

76.

GLM-5.2 (Max) by @Zai_org ranks #10 on the new Agent Arena leaderboard (x.com)

GLM-5.2 (Max) by @Zai_org ranks #10 on the new Agent Arena leaderboard, closely matching Claude-Opus-4.8 (non-thinking) and is the #1 open model by a wide margin! In Agent Arena, we measure models on millions of real-world, long-horizon

by @arena (Arena.ai) · backlist 2026-06-16 · rubric 72.0

77.

GLM 5.2 is now #3 on SWE-Marathon, ahead of GPT-5.5, Gemini 3.5 Flash, and DeepSeek V4 Pro.

GLM 5.2 is now #3 on SWE-Marathon, ahead of GPT-5.5, Gemini 3.5 Flash, and DeepSeek V4 Pro. The standout result: GLM 5.2 is remarkably reward-hack resistant.

by @rishi_desai2 (Rishi Desai) · backlist 2026-06-16 · rubric 72.0

78.

we just released: MVEB: Massive Video Embedding Benchmark

we just released: MVEB: Massive Video Embedding Benchmark with more ai-generated videos, good video embeddings may become key, as you cant just grep through videos like for text..

by @Muennighoff (Niklas Muennighoff) · backlist 2026-06-16 · rubric 72.0

79.

wow. looks like we're getting a new grok-cursor model in a few weeks that matches gpt 5.5 and opus 4.8 on capab…

wow. looks like we're getting a new grok-cursor model in a *few weeks* that matches gpt 5.5 and opus 4.8 on capability - insane how additive cursor's harness is also this is trained from scratch, no open-source model base like prev compose

by @cryptopunk7213 (Ejaaz) · backlist 2026-06-16 · rubric 72.0

80.

UNUSUAL | After 5 years of legal battle to obtain Laurent Wauquiez's expense reports, Mediacités finally receives…

UNUSUAL | After 5 years of legal battle to obtain Laurent Wauquiez's expense reports, Mediacités finally receives the documents. But instead of an exploitable file: 3 boxes of loose paper, up to 12 kg, with 7,000 expense receipts to sort

by @french_report78 (French Report) · backlist 2026-06-16 · rubric 72.0

81.

Tier 2 Starmer

by @PaleoAutist (Paleo-Autist) · backlist 2026-06-16 · rubric 72.0

82.

the right abstraction for collaborating with agent teams is mission control, not command line

the right abstraction for collaborating with agent teams is mission control, not command line factory is building the first one for the software development lifecycle harvey is doing the same for commercial contract teams the console is t

by @nayakkayak (Aatish Nayak) · backlist 2026-06-16 · rubric 72.0

83.

$686,000 on polymarket

$686,000 on polymarket one market. one question. will bitcoin go up or down in the next 5 minutes that's it 34,089 times. same bet. over and over. one trade $17,839 in, $36,318 out today alone $2,626 profit no elections. no world cu

by @CryptoMavka (Crypto Mavka) · backlist 2026-06-16 · rubric 72.0

84.

the mechanism is "out of scope." the scope of a technical report, apparently, excludes the technique.

the mechanism is "out of scope." the scope of a technical report, apparently, excludes the technique. (i can describe what a black box does too.)

by @nathancgy4 (nathan chen) · backlist 2026-06-16 · rubric 72.0

85.

SCOOP: Mistral are preparing to release Mistral Large 4, their first large reasoning model, in the coming weeks!

SCOOP: Mistral are preparing to release Mistral Large 4, their first large reasoning model, in the coming weeks! It has a context window of 256K and supports vision. Oh lawd he comin

by @synthwavedd (leo) · backlist 2026-06-16 · rubric 72.0

86.

You could literally: (t.co)

You could literally: > deposit $200 on Polymarket > point a bot at the NBA order book > let it scan for odds that haven’t caught up to the play yet > enter in 0.8s, exit in 0.8s > repeat 68,000 times > his profile http://polymarket.com/@

by @martynov014 (Valentin) · backlist 2026-06-16 · rubric 72.0

87.

Over the past day the largest trades by volume on hyperliquid were spacex perpetuals, hype, and XYZ100 nasdaq per…

Over the past day the largest trades by volume on hyperliquid were spacex perpetuals, hype, and XYZ100 nasdaq perpetuals. The platform is migrating from only crypto focused trading to real world asset trading volume. Blockchain trading t

by @EricSRosengren (Eric Rosengren) · backlist 2026-06-16 · rubric 72.0

88.

Trader GoalLineGhost is loaded on every World Cup match on Polymarket

Trader GoalLineGhost is loaded on every World Cup match on Polymarket Moneyline, spreads, totals. He’s in every market After yesterday’s Spain losses, he’s now down over $1.5M Either he’s about to make a huge comeback… or funding everyo

by @mahera777 (mahera) · backlist 2026-06-16 · rubric 72.0

89.

VibeThinker-3B is released — a dense 3B model for frontier-level verifiable reasoning.

VibeThinker-3B is released — a dense 3B model for frontier-level verifiable reasoning. Reasoning: 94.3 on AIME’26, 76.4 on IMO-AnsBench, and 80.2 Pass@1 on LCB v6; with CLR, AIME’26 improves to 97.1 and IMO-AnsBench to 80.6. OOD Coding

by @WeiboLLM · backlist 2026-06-16 · rubric 72.0

90.

The way we will create a future where powerful AI is open-source and available to all is by making AI radically m…

The way we will create a future where powerful AI is open-source and available to all is by making AI radically more efficient, both in terms of inference compute and (more importantly) in terms of training data requirements. This is what s

by @fchollet (François Chollet) · backlist 2026-06-16 · rubric 72.0