Backlist — 20 May 2026 UTC

1.

AI solves the Erdős unit distance problem

A general-purpose model resolved a famous 80-year-old combinatorial geometry problem that many mathematicians had tried to crack

by @wtgowers (Timothy Gowers) · backlist 2026-05-20 · rubric 52.0

2.

GitHub investigates unauthorized access to internal repositories

Unauthorized access to GitHub’s internal repositories put developer workstation compromise and private-code exposure back at the center of software supply-chain risk

by @lucacs (luca°) · backlist 2026-05-20 · rubric 90.0

3.

Railway outage postmortem: GCP suspended multiple accounts

A multi-AZ platform went down after a single-cloud account failure, turning cloud-provider dependency into an immediate reliability design problem

by @JustJake (Jake) · backlist 2026-05-20 · rubric 84.0

4.

SpaceX S-1 breaks out Starlink, launch, and AI/xAI economics

SpaceX reported about $18.7B in 2025 revenue, with Starlink profitable, launch growing slowly, and the AI/xAI segment producing a large operating loss

by @pitdesi (Sheel Mohnot) · backlist 2026-05-20 · rubric 34.0

5.

Mosaic generates 24-member 10-day global weather forecasts in under 12 seconds

A probabilistic weather model reached state-of-the-art skill while producing global ensemble forecasts on a single H100 in seconds

by @maxxxzdn (Max Zhdanov) · backlist 2026-05-20 · rubric 68.0

6.

PsiQuantum names Australian site for utility-scale fault-tolerant quantum computer

Fault-tolerant quantum computing moved from roadmap language to a named manufacturing and deployment site in Moreton Bay

by @PsiQuantum · backlist 2026-05-20 · rubric 24.0

7.

Colossal says it built an artificial egg for full bird development

External bird development could change avian conservation and de-extinction work by removing dependence on natural shells and surrogate birds

by @colossal (Colossal Biosciences®) · backlist 2026-05-20 · rubric 66.0

8.

San Diego fell from 5th to 12th most expensive rental market after building more multifamily housing

A California city that built more multifamily housing per capita saw its rental-cost ranking drop sharply

by @realsaadasad (Saad Asad) · backlist 2026-05-20 · rubric 10.0

9.

White House EO calls for payments companies to access US settlement rails

The US remains the only G7 country where regulated payments companies cannot directly access government settlement rails

by @patrickc (Patrick Collison) · backlist 2026-05-20 · rubric 7.0

10.

GPU compute futures are coming (x.com)

Compute is becoming a tradable financial asset class rather than only a cloud-infrastructure input

by @zjasper (Jasper) · backlist 2026-05-20 · rubric 82.0

11.

Cheap CPU cloud servers are getting less cheap

CPU, DRAM, and storage scarcity is showing up in price increases from providers such as Hetzner, OVHcloud, and Scaleway

by @SemiAnalysis_ (SemiAnalysis) · backlist 2026-05-20 · rubric 88.0

12.

SendCutSend is testing rapid 3D printing against injection molding (x.com)

Fast 3D printing can make small-run production competitive with offshore injection molding for startups and hardware teams

by @tbpn (TBPN) · backlist 2026-05-20 · rubric 58.0

13.

Cohere releases Command A+, an Apache 2.0 multimodal MoE model

A 218B-parameter sparse model with 25B active parameters runs on one B200 and ships with open weights under Apache 2.0

by @JayAlammar (Jay Alammar) · backlist 2026-05-20 · rubric 78.0

14.

TideGS trains over 1B 3D Gaussian primitives on a 24GB GPU

Reframing 3D Gaussian Splatting as a working-set caching problem lets billion-scale scenes train without an 80GB GPU cluster

by @HaoZhao_AIRSUN (Hao Zhao) · backlist 2026-05-20 · rubric 78.0

15.

RAPTOR: a tiny foundation policy for quadrotors (x.com)

A single compact policy adapts in milliseconds across different quadrotors and autopilots without fine-tuning

by @loiannog (Giuseppe Loianno) · backlist 2026-05-20 · rubric 58.0

16.

HBmito optically captures single mitochondrial electrical spikes (t.co)

A membrane-voltage sensor makes individual mitochondrial channel activity visible in real time

by @xipeng1 (Peng Xi) · backlist 2026-05-20 · rubric 58.0

17.

Structured Wikipedia is now available in Parquet

Wikipedia in columnar format makes large-scale analysis and retrieval pipelines much easier to build

by @lhoestq (Quentin Lhoest) · backlist 2026-05-20 · rubric 32.0

18.

Building the Linux kernel with Bazel and remote execution (x.com)

The Linux kernel can now be built from source using Bazel with remote cache and remote build execution

by @sluongng (Son Luong) · backlist 2026-05-20 · rubric 88.0

19.

CSS style() queries land across browser engines

Style queries let CSS respond to computed styles, opening a new class of component-local responsive behavior

by @FirefoxWebDevs (Firefox for Web Developers) · backlist 2026-05-20 · rubric 27.0

20.

Vite+ adds faster global package management

Parallel installation and skipping unchanged packages make updates materially faster in the Vite+ package manager

by @liangmiQwQ (Liang) · backlist 2026-05-20 · rubric 81.0

21.

PegaFlow: external KV cache for vLLM (x.com)

Persisting KV cache outside vLLM restarts can reduce wasted prefill work and make long-context serving more durable

by @vllm_project (vLLM) · backlist 2026-05-20 · rubric 88.0

22.

FP4 training instability comes from Wgrad, not the forward pass

Progressively swapping transformer GEMMs shows MXFP4 full-pipeline training breaks at weight-gradient computation

by @gm8xx8 (𝚐𝔪𝟾𝚡𝚡𝟾) · backlist 2026-05-20 · rubric 84.0

23.

Do extracted circuits actually explain model behavior? (x.com)

Ablating one task’s discovered circuit hurts another task about as much as ablating that task’s own circuit

by @bearseascape (Michael Li) · backlist 2026-05-20 · rubric 78.0

24.

Goodfire open-sources EVEE MCP for genetic variant effect prediction

State-of-the-art variant-effect prediction became easier to run through an open MCP and Claude skill

by @GoodfireAI (Goodfire) · backlist 2026-05-20 · rubric 88.0

25.

Publicis to acquire LiveRamp for $2.2B

A major marketing holding company is buying identity and data-connectivity infrastructure at about 2.7x revenue

by @tanayj (Tanay Jaipuria) · backlist 2026-05-20 · rubric 20.0

26.

Socket Security raises Series C at $1B valuation (x.com)

Software supply-chain security has a data flywheel: the company that sees more attacks can build better detection and attract more customers

by @ramtinnaimi (Ramtin Naimi) · backlist 2026-05-20 · rubric 28.0

27.

St. Charles, Missouri effectively bans new data centers

A US city changed land-use rules so data centers cannot be built unless the rules are changed again

by @MorePerfectUS (More Perfect Union) · backlist 2026-05-20 · rubric 22.0

28.

A hallucinated NeurIPS citation propagated into OpenReview and Google Scholar (t.co)

False conference metadata in citation exports shows how academic databases can launder fabricated publication claims

by @Aaroth (Aaron Roth) · backlist 2026-05-20 · rubric 82.0

29.

Minecraft players secretly archive 24TB of the 2b2t server

A small team spent years mapping a massive region of one of the oldest anarchy servers in Minecraft history

by @Pirat_Nation (Pirat_Nation) · backlist 2026-05-20 · rubric 12.0

30.

Landscapes.directory doubles its painting archive and adds infinite canvas browsing (t.co)

A specialized landscape-painting archive grew from 1,300 to more than 2,600 works with new filters and a grid view

by @driceroland (Drice) · backlist 2026-05-20 · rubric 4.0

31.

AutoResearchClaw tech report + v0.5.0 just dropped.

AutoResearchClaw tech report + v0.5.0 just dropped. 12,300+ on GitHub. Two big additions this release: 1/ Domain-Expert Agents in the experiment stage: Specialized agents for high-energy physics, biology, and more. Real domain tools + k

by @HuaxiuYaoML (Huaxiu Yao) · backlist 2026-05-20 · rubric 97.0

32.

Gemini Flash 3.5 is now on CursorBench, our main coding agent eval.

Gemini Flash 3.5 is now on CursorBench, our main coding agent eval. We’ll keep updating the leaderboard as new models come out.

by @mntruell (Michael Truell) · backlist 2026-05-20 · rubric 95.0

33.

Managed to get QWEN 3.6 27B running in Cursor with localhost (no Ngrok) at 120 - 140 TPS on 2x 3090s.

by @Youssofal_ (Youssof Altoukhi) · backlist 2026-05-20 · rubric 94.0

34.

releasing /synthetic-self-improve-rl. claude code (teacher) skill that designs/writes the synthetic data, env and…

releasing /synthetic-self-improve-rl. claude code (teacher) skill that designs/writes the synthetic data, env and rewards to post-train a smaller model (student). it post-trains the student on a real dataset, reads its failure traces, then

by @vivek_2332 (Vivek) · backlist 2026-05-20 · rubric 93.0

35.

does cheating scale inversely with length of rollout during RL posttraining and/or length of trace in SFT?

does cheating scale inversely with length of rollout during RL posttraining and/or length of trace in SFT? Is there a work disambiguating the learned policy (“big scope means terminated session no reward”) from the human data/behavior in p

by @fujikanaeda (Eric W. Tramel) · backlist 2026-05-20 · rubric 93.0

36.

btw this agent behaves like a worm

btw this agent behaves like a worm you add it to Slack once, and without asking, it researches and DMs your teammates to convince them to use it. It also accessed channels I never granted it permission to. cool growth hack, but I consid

by @bentlegen (Ben Vinegar) · backlist 2026-05-20 · rubric 92.0

37.

if you're looking for a solution to run tests on your agents, are questioning why evals are so fucking complicate… (t.co)

if you're looking for a solution to run tests on your agents, are questioning why evals are so fucking complicated, and use typescript: https://vitest-evals.sentry.dev spent a bunch of cycles yesterday making the docs not slop works oo

by @zeeg (David Cramer) · backlist 2026-05-20 · rubric 92.0

38.

We’ve released a technical report for Toto 2.0 detailing the data, architecture, training recipe, μP/u-μP hyperpa…

We’ve released a technical report for Toto 2.0 detailing the data, architecture, training recipe, μP/u-μP hyperparameter transfer pipeline, and benchmark results behind our 5-model open-weight release. Report linked below.

by @atalwalkar (Ameet Talwalkar) · backlist 2026-05-20 · rubric 92.0

39.

tested hrm (looped hierarchical transformer) vs a standard stacked transformer for speech generation at equal par…

tested hrm (looped hierarchical transformer) vs a standard stacked transformer for speech generation at equal params. Setup: • ~15M params • EnCodec audio tokens • 20h LibriTTS-R • 3 seeds • ~$15 compute Results: • Stacked transformer: 3.

by @HarshalsinghCN (harrrshall) · backlist 2026-05-20 · rubric 92.0

40.

today was lint errors (t.co)

today was lint errors https://github.com/oven-sh/bun/pull/31116 …

by @jarredsumner (Jarred Sumner) · backlist 2026-05-20 · rubric 92.0

41.

Today we're introducing Claude Code for Marketing.

Today we're introducing Claude Code for Marketing. In one prompt, Fastlane deploys social media accounts, generates viral content, and posts everything for you automatically. This is beyond insanity.

by @gauravsbuilding (Gaurav) · backlist 2026-05-20 · rubric 92.0

42.

> be GitHub Employee

> be GitHub Employee > browse VS Code Extensions > installs fancy new extension > fancy new extension is actually malware > GitHub gets breached

by @vxdb · backlist 2026-05-20 · rubric 92.0

43.

I ported HRM-Text-1B to Apple MLX

I ported HRM-Text-1B to Apple MLX On an M4 Max: PyTorch MPS BF16: 22 tok/s HRM-mlx BF16: 28 tok/s HRM-mlx 4-bit: 53 tok/s That’s 2.4x faster single-response decode, with hosted MLX BF16 + 4-bit checkpoints

by @aryagm01 (Arya Manjaramkar) · backlist 2026-05-20 · rubric 92.0
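
The "2.4x" figure in the post above can be checked directly from the three throughput numbers it quotes. A minimal sketch of that arithmetic follows; the helper name `speedup_over` is hypothetical and used only for illustration, and the only inputs are the tok/s values stated in the post.

```python
# Decode throughputs (tokens/second) as quoted in the post, measured on an M4 Max.
throughput = {
    "PyTorch MPS BF16": 22.0,
    "HRM-mlx BF16": 28.0,
    "HRM-mlx 4-bit": 53.0,
}


def speedup_over(baseline: str, rates: dict) -> dict:
    """Return each configuration's throughput divided by the baseline's."""
    base = rates[baseline]
    return {name: rate / base for name, rate in rates.items()}


speedups = speedup_over("PyTorch MPS BF16", throughput)
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

On these numbers the 4-bit MLX build is the one that reaches roughly 2.4x over the PyTorch MPS baseline; the BF16 MLX port alone is closer to 1.3x.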

44.

i just used fuzzers, performance benchmarking and agent loops to blow all the existing node based json diff / pat…

i just used fuzzers, performance benchmarking and agent loops to blow all the existing node based json diff / patch libs out of the water. performance gains are still improving across memory and speed.

by @JungleSilicon (Silicon Jungle) · backlist 2026-05-20 · rubric 92.0

45.

A serious compromise at github, again due to a supply chain vulnerability...

A serious compromise at github, again due to a supply chain vulnerability... It demonstrates that basically everyone needs to start securing their software supply chain through every means possible, deterministic scanning being the first s

by @gneubig (Graham Neubig) · backlist 2026-05-20 · rubric 90.0

46.

turbopuffer x SID

turbopuffer x SID An easy way to tell a good from a great AI researcher: how much do they think about infrastructure. Infra extends beyond what’s running on the GPUs: Slow environments will bottleneck your training steps. More parallel an

by @maxrumpf (Max Rumpf) · backlist 2026-05-20 · rubric 89.0

47.

Generating Sudoku map. GRAM generates valid maps in less than 10 recursion steps. Diffusion (D3PM) takes much mor…

Generating Sudoku map. GRAM generates valid maps in less than 10 recursion steps. Diffusion (D3PM) takes much more steps and often leaves incorrect cells.

by @SungjinAhn_ (Sungjin Ahn) · backlist 2026-05-20 · rubric 88.0

48.

I took one key insight from this convo: inference disaggregation between prefill and decode enable GPU lifespan t…

I took one key insight from this convo: inference disaggregation between prefill and decode enable GPU lifespan to be extended to 10+ years. This totally shifts the risk and return profile of datacenter capex - especially for neoclouds suc

by @tropicalvalue (Tropical Value) · backlist 2026-05-20 · rubric 88.0

49.

BREAKING: GitHub has been compromised by TeamPCP. GitHub has confirmed the internal breach. A poisoned VS Code e…

BREAKING: GitHub has been compromised by TeamPCP. GitHub has confirmed the internal breach. A poisoned VS Code extension on an employee device exfiltrated ~3,800 internal repositories. TeamPCP is already selling the data on a cybercrime

by @IntCyberDigest (International Cyber Digest) · backlist 2026-05-20 · rubric 88.0

50.

Here's how I got coding agents to relieve my allergy:

Here's how I got coding agents to relieve my allergy: 1. Connect it to all my contexts (via Smithery) 2. Spin up a long-running task via /goal that runs for 30 min+ 3. Be bored 4. Realize I need to do higher leverage tasks: something agent

by @Calclavia (Henry Mao) · backlist 2026-05-20 · rubric 88.0

51.

> no jsx in these files

> no jsx in these files > codex uses createElement > no createElement, please > codex uses jsx() call > pain

by @mattrickard (Matt Rickard) · backlist 2026-05-20 · rubric 88.0

52.

We have recovered our compute on Google Cloud, but services are unable to start because of ongoing networking iss…

We have recovered our compute on Google Cloud, but services are unable to start because of ongoing networking issues on Google Cloud's side. We are engaged with Google Cloud support to resolve this and will post the next update as soon as w

by @Railway · backlist 2026-05-20 · rubric 88.0

53.

Today, we are releasing Google’s open source distributed agent runtime.

Today, we are releasing Google’s open source distributed agent runtime. Agent Executor (AX) is a general purpose runtime and aims to solve dynamic scheduling, resumption, auto recovery, auditing, and trajectory branching from kernel snap

by @rakyll (Jaana Dogan ヤナドガン) · backlist 2026-05-20 · rubric 86.0

54.

We designed the network control plane to survive AZ failures without interruption

We designed the network control plane to survive AZ failures without interruption However, nuking every AZ within a single cloud was not in our threat model The fix: running a shard on every cloud in the network ring (AWS, GCP, Metal)

by @JustJake (Jake) · backlist 2026-05-20 · rubric 86.0

55.

doing l'étude hydrologique for a land plot. claude found website with local lidar data, pulled the relevant tiles…

doing l'étude hydrologique for a land plot. claude found website with local lidar data, pulled the relevant tiles, conjured python scripts to outline the full catchment collecting rainwater running onto the plot, and overlaid it nicely on t

by @l4rz (lab member 001) · backlist 2026-05-20 · rubric 86.0

56.

Our paper on optimize_anything has been accepted to CAIS 2026, and is out on Arxiv with expanded experiments and …

Our paper on optimize_anything has been accepted to CAIS 2026, and is out on Arxiv with expanded experiments and details! A unified API to optimize agents (with architecture), CUDA kernels, cloud scheduling policies, or even graphics!

by @LakshyAAAgrawal (Lakshya A Agrawal) · backlist 2026-05-20 · rubric 86.0

57.

Thanks (x.com)

Thanks @_akhaliq for sharing our new survey! Check more details below: https://code-as-harness.github.io/code-as-harness-webpage …

by @Jiaru_Zou (Jiaru "Rubin" Zou) · backlist 2026-05-20 · rubric 86.0

58.

SID-1 is an agentic search model by (x.com)

SID-1 is an agentic search model by @SID_AI → 1.9x recall over RAG + rerank → 24x faster, 99% cheaper than GPT-5.1 trained using large-scale RL on turbopuffer at 1k+ QPS bursts over 10M+ document corpora across thousands of steps

by @turbopuffer · backlist 2026-05-20 · rubric 84.0

59.

Not able to tune into this Twitter space, so I asked Codex to listen to it for me

Not able to tune into this Twitter space, so I asked Codex to listen to it for me So many gems but I’m not able to stay for the full thing. My background hyperbox is transcribing it and taking notes for me

by @AlexReibman (Alex Reibman) · backlist 2026-05-20 · rubric 84.0

60.

Extremely excited to present Command A+, our first sparse model!

Extremely excited to present Command A+, our first sparse model! I am very proud of the work we did to enable this model. We built our sparse training stack from the ground up over the past year with a lot of custom kernels, performance en

by @DwaraknathG (Dwarak) · backlist 2026-05-20 · rubric 84.0

61.

If you’ve joined the vibe-coding wave (we certainly have!), one bottleneck you might have noticed is that the “ju…

If you’ve joined the vibe-coding wave (we certainly have!), one bottleneck you might have noticed is that the “just rent a cheap CPU box” step is no longer as routine as it used to be. (1/3)

by @SemiAnalysis_ (SemiAnalysis) · backlist 2026-05-20 · rubric 84.0

62.

interesting results on this new benchmark

interesting results on this new benchmark hyperparam search > sonnet 4.6 > glm-5 > gpt-5.5 > vLLM default > Opus 4.7 lol

by @scaling01 (Lisan al Gaib) · backlist 2026-05-20 · rubric 84.0

63.

Anti-Self-Distillation for Reasoning RL

Anti-Self-Distillation for Reasoning RL Invert the divergence. Preserving deliberation tokens like "Wait" and "Maybe" instead of template parroting leads to 2-10x faster convergence and +11.5 points on AIME/HMMT across 4B-30B models.

by @HuggingPapers (DailyPapers) · backlist 2026-05-20 · rubric 84.0

64.

not sure yet if strict enough. there’re probably more we should add? probably will also want some custom ones.

by @jarredsumner (Jarred Sumner) · backlist 2026-05-20 · rubric 84.0

65.

Very cool train-free extension to TRM. By injecting noise into the latent space, TRMs can explore a wider set of … (t.co)

Very cool train-free extension to TRM. By injecting noise into the latent space, TRMs can explore a wider set of basins, and the exit head can then identify which trajectories succeeded. Feels like unlocking an entirely new scaling axis. Aw

by @f14bertolotti (Francesco Bertolotti) · backlist 2026-05-20 · rubric 84.0

66.

Do you hear the people sing? Frontier models clearly do not, but hallucinate that they do.

Do you hear the people sing? Frontier models clearly do not, but hallucinate that they do. We found that, surprisingly, leading omni-modality foundation models are terrible at understanding the audio track of videos, and takes the shortcut

by @qi2peng2 (Peng Qi) · backlist 2026-05-20 · rubric 84.0

67.

I've been using clawputer (openclaw inside (t.co)

I've been using clawputer (openclaw inside http://opencomputer.dev) for a bunch of usecases. 1. daily briefs on specific news that I care about 2. tracking my workouts (it has integrations with whoop, apple health and strava via pipedrea

by @utpalnadiger (Utpal Nadiger) · backlist 2026-05-20 · rubric 84.0

68.

Earlier this year I worked on giving agents access to our custom embedded JS runtime to script the canvas. Now it…

Earlier this year I worked on giving agents access to our custom embedded JS runtime to script the canvas. Now it's one of the core primitives behind our new first party agent :)

by @bhelx (Benjamin Eckel) · backlist 2026-05-20 · rubric 83.0

69.

Introducing Caveman-Code, a coding agent that uses 1.93× fewer tokens than Codex CLI

Introducing Caveman-Code, a coding agent that uses 1.93× fewer tokens than Codex CLI same model. same tasks. just caveman.

by @julius_brussee (Julius Brussee) · backlist 2026-05-20 · rubric 83.0

70.

Congrats to the VeRL-Omni team on the pre-release of a general RL post-training framework for multimodal generati…

Congrats to the VeRL-Omni team on the pre-release of a general RL post-training framework for multimodal generative models. Built on verl + vllm-omni. vLLM-Omni handles the multimodal rollout with step-wise continuous batching and embeddi

by @vllm_project (vLLM) · backlist 2026-05-20 · rubric 83.0

71.

There's an arXiv paper from 2 weeks ago that the finops community hasn't absorbed yet.

There's an arXiv paper from 2 weeks ago that the finops community hasn't absorbed yet. The authors ran identical agentic tasks. Same model. Same prompt. Same context window. Same tool stack. They measured end-to-end token consumption acros

by @demian_ai (dylan ツ) · backlist 2026-05-20 · rubric 82.0

72.

Exa raised $250M at a $2.2B valuation, led by a16z, to continue organizing the web for agents:

Exa raised $250M at a $2.2B valuation, led by a16z, to continue organizing the web for agents: - Exa now serves search to Cursor, Cognition, Openrouter, 5000+ other companies, 500k+ developers - We’re SOTA in many important verticals (code

by @jeffzwang (Jeffrey Wang) · backlist 2026-05-20 · rubric 82.0

73.

> be perplexity

> be perplexity > launch computer > serve hundreds of millions of queries + tasks per day > realize every wasted token hurts > compress web results 50x before they hit the context window > same quality, cheaper context, faster answers > sk

by @natjin (nat jin) · backlist 2026-05-20 · rubric 82.0

74.

this part is even more crazy. they do moe_output = (routed_output + shared_output)/2 ???

this part is even more crazy. they do moe_output = (routed_output + shared_output)/2 ??? wouldn't this be a really bad init for experts? the model would be so incentivized to use shared expert capacity and the routed experts would need to l

by @stochasticchasm (stochasm) · backlist 2026-05-20 · rubric 82.0
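
To make the combination rule quoted in the post above concrete, here is a toy sketch of a layer that averages a shared expert with a gated mix of routed experts. Only the line `moe_output = (routed_output + shared_output) / 2` comes from the post; the dimensions, the random weights, the top-k softmax gating, and the helper names (`rand_matrix`, `matvec`, `moe_layer`) are assumptions made for illustration and do not describe the model being discussed.

```python
import math
import random

random.seed(0)
D_MODEL, N_EXPERTS, TOP_K = 4, 4, 2  # hypothetical toy sizes


def rand_matrix(rows, cols):
    """Random Gaussian matrix as a list of rows (toy stand-in for weights)."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(x, w):
    """Multiply row vector x (length rows) by matrix w (rows x cols)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


w_shared = rand_matrix(D_MODEL, D_MODEL)
w_routed = [rand_matrix(D_MODEL, D_MODEL) for _ in range(N_EXPERTS)]
w_router = rand_matrix(D_MODEL, N_EXPERTS)


def moe_layer(x):
    shared_output = matvec(x, w_shared)

    # Assumed gating: softmax over the top-k router logits.
    logits = matvec(x, w_router)
    top = sorted(range(N_EXPERTS), key=lambda e: logits[e])[-TOP_K:]
    peak = max(logits[e] for e in top)
    weights = [math.exp(logits[e] - peak) for e in top]
    total = sum(weights)
    gates = [w / total for w in weights]

    routed_output = [0.0] * D_MODEL
    for gate, e in zip(gates, top):
        expert_out = matvec(x, w_routed[e])
        routed_output = [r + gate * o for r, o in zip(routed_output, expert_out)]

    # The rule quoted in the post: a plain average of the two branches.
    moe_output = [(r + s) / 2 for r, s in zip(routed_output, shared_output)]
    return moe_output, routed_output, shared_output


x = [random.gauss(0.0, 1.0) for _ in range(D_MODEL)]
moe_output, routed_output, shared_output = moe_layer(x)
```

In this sketch each branch contributes exactly half of the output regardless of how the router behaves, which is the property the post is questioning.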

75.

1/ Stop hand-crafting test-time scaling heuristics. A new paper shows an AI agent can discover an inference algor…

1/ Stop hand-crafting test-time scaling heuristics. A new paper shows an AI agent can discover an inference algorithm that beats Self-Consistency, cuts token costs by 70%, and the total search compute cost was under $40.

by @che_shr_cat (Grigory Sapunov) · backlist 2026-05-20 · rubric 82.0

76.

For alignment you need V, but is hard to compute. Most methods try to approximate with (x.com)

For alignment you need V, but is hard to compute. Most methods try to approximate with 1) Tweedie, which is biased 2) MC roll-outs, which is slow with high var. Training V was often neglected since it's hard. We beg to differ. StitchVM e

by @hyungjin_chung (Hyungjin Chung) · backlist 2026-05-20 · rubric 82.0

77.

ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation (t.co)

ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation Clusters users via HDBSCAN and retrieves documents from both the target user's profile and similar users' profiles. https://arxiv.org/abs

by @_reachsumit (Sumit) · backlist 2026-05-20 · rubric 82.0

78.

Interesting thing GBrain can do now: If you have a skill + code + test + resolver + resolver trigger + evals you …

Interesting thing GBrain can do now: If you have a skill + code + test + resolver + resolver trigger + evals you want to package for someone else to use... GBrain will package it up for you into what I call a *SKILLPACK* It's tarball and

by @garrytan (Garry Tan) · backlist 2026-05-20 · rubric 82.0

79.

We are seeing gradual recovery on Railway metal workloads. To ensure things remain stable as we ramp back up, we …

We are seeing gradual recovery on Railway metal workloads. To ensure things remain stable as we ramp back up, we are temporarily throttling all non-enterprise builds to avoid overwhelming our build infrastructure.

by @Railway · backlist 2026-05-20 · rubric 82.0

80.

as far as i can tell the antigravity cli just... doesn't default to using the directory you start it in as its wo…

as far as i can tell the antigravity cli just... doesn't default to using the directory you start it in as its workspace?? it always wakes up confused in an empty .gemini/scratch directory and u gotta /add-dir manually??? so baffling i can

by @tenobrus (Tenobrus) · backlist 2026-05-20 · rubric 82.0

81.

This is the problem with Flash 3.5 - fast, smart, and an order of magnitude more expensive than its predecessor i…

This is the problem with Flash 3.5 - fast, smart, and an order of magnitude more expensive than its predecessor in practice. 3x higher per-token costs and being hugely verbose on default settings is a bad combination.

by @sdmat123 (sdmat) · backlist 2026-05-20 · rubric 82.0

82.

Scaling embodied AI starts with automating the environments.

Scaling embodied AI starts with automating the environments. Introducing SimWorld Studio: a self-evolving factory for endless interactive 3D environments where agents act, fail, and learn. With coding-agent + embodied-agent co-evoluti

by @Lianhuiq (Lianhui Qin) · backlist 2026-05-20 · rubric 80.0

83.

In our pre-release testing, Command A+ performed strongly on speed for its intelligence, reaching 281 output toke…

In our pre-release testing, Command A+ performed strongly on speed for its intelligence, reaching 281 output tokens per second. This reflects higher intelligence and speed than models such as gpt-oss-120b, but sits behind the new Pareto fro

by @ArtificialAnlys (Artificial Analysis) · backlist 2026-05-20 · rubric 79.0

84.

Homage to karpathy joining anthropic: made an in-browser (webgpu) version of the famous char-rnn "Unreasonable ef…

Homage to karpathy joining anthropic: made an in-browser (webgpu) version of the famous char-rnn "Unreasonable effectiveness of RNNs" demo, training on shakespeare :) ahh it's just as cool as when I was a student playing with keras in 2016

by @ekzhang1 (Eric Zhang) · backlist 2026-05-20 · rubric 79.0

85.

The age of one-time token due diligence is over. A given "token" can easily involve 300+ changing contracts.

The age of one-time token due diligence is over. A given "token" can easily involve 300+ changing contracts. We've built a balance sheet graph for every token, so you can see all protocol/token dependencies and then model economic and oper

by @andrewhong5297 (ilemi) · backlist 2026-05-20 · rubric 78.0

86.

Firecracker was built by AWS for Lambda functions.

Firecracker was built by AWS for Lambda functions. - very fast spin up - stateless by design - ephemeral by default Perfect for Black Friday traffic spikes, but it can't run a GPU sandbox or Windows, or an Android device inside it. For t

by @ivanburazin (Ivan Burazin) · backlist 2026-05-20 · rubric 78.0

87.

I joined Exa when it was 25 people. Today we raised $250M, are a 100+ people and have built the search engine for…

I joined Exa when it was 25 people. Today we raised $250M, are a 100+ people and have built the search engine for AI. it still feels like the early days, we're building infra to manage trillions of requests, endpoints that can handle any

by @nityasnotes (Nitya Sridhar) · backlist 2026-05-20 · rubric 78.0

88.

@ollama (x.com)

@ollama + @deepseek_ai v4 pro handled entire monthly dev reports on Eigent. github prs → word doc → slack message → sent to product-release channel. in just one prompt. fully local. the full walkthrough is in the thread. try the same l

by @Eigent_AI (Eigent AI) · backlist 2026-05-20 · rubric 78.0

89.

Pre-training is increasingly data-constrained: compute outruns text, models repeat tokens many times, and how muc… (x.com)

Pre-training is increasingly data-constrained: compute outruns text, models repeat tokens many times, and how much repetition you can afford is an open question. In "Mix, Don't Tune" (my @Apple MLR internship), we run ~1000 pre-training

by @jeha_paul (Paul Jeha) · backlist 2026-05-20 · rubric 78.0

90.

also some performance things

by @jarredsumner (Jarred Sumner) · backlist 2026-05-20 · rubric 78.0