Debugging ClickHouse lock contention at petabyte scale
A Cloudflare billing outage traced to ClickHouse query-planner lock contention produced upstream patches for a petabyte-scale cluster
Top 90 curated tweets ranked for substance on 14 May 2026 UTC.
A critical NGINX remote-code-execution flaw affecting rewrite and set directives was disclosed with proof-of-concept code for versions 0.6.27 through 1.30.0
Self-hosted Next.js instances can be forced by unauthenticated attackers to make internal HTTP requests through the WebSocket upgrade handler
Codex’s Windows sandbox tackles the tradeoff between constant approval prompts and giving coding agents unrestricted machine access
Google’s inference-focused TPU pod uses a high-radix Broadfly ICI network with OCS layers to scale to 1,152 TPUs
AI scaling constraints are broken down across power, semiconductors, PCBs, HBM, indium phosphide, and company-specific exposure
A new historical finance dataset links bank balance sheets, bank runs, and bank failures across two centuries of U.S. banking history
Keytruda became the most lucrative drug in history after being discovered by accident and nearly out-licensed for very little in 2010
Apple’s perceptual image codec shows a new learned-compression tradeoff with interesting quality results and relatively slow decode times
A major JavaScript runtime moving its rewrite into Rust marks a sharp shift in the JS tooling stack’s language and reliability choices
Raycast 2 shows that a hybrid stack using web views, Node.js, and targeted Rust can feel native without becoming Electron
Database customers expect optimization immediately because acceptable baseline performance already requires a surprising amount of specialized engineering
Fair queueing prevents one tenant’s workload from starving others in shared infrastructure
Retrieval experiments show that adding BM25 and a pointwise reranker can sometimes perform worse than using dense embeddings alone
A robot world model is useless for control if it generates beautiful video while ignoring the causal effect of the robot’s motor commands
Figure’s F.03 robot was shown doing a logistics task autonomously for hours, including shift changes between robots
Five studies covering more than 3,000 users and 12,000 conversations found sycophantic AI changes how people view close relationships and seek advice
Chimoney built cross-border payments across 41 currencies and secured Canadian PSP licensing, but lacked the distribution capital to survive
Seat-based pricing breaks down when agents rather than humans become primary software users
A Lemon Squeezy support-agent mistake canceled 20,000 subscriptions and sent cancellation emails to customers, creating immediate financial and reputational damage
U.S. export approvals for Nvidia H200 sales to major Chinese firms change the near-term AI hardware constraint landscape
AT&T, T-Mobile, and Verizon’s satellite direct-to-device joint venture gives carriers a unified position against satellite providers reaching customers directly
San Francisco’s shadow appeals have delayed or stalled more than 2,000 housing units over the last decade
Publishing papers beside code in a repo gives researchers one immediate link and challenges arXiv’s value as the default preprint gatekeeper
Solo email newsletters are rebuilding block-level local coverage with shoe-leather reporting in neighborhoods that larger outlets no longer cover
Upshift Clear targets the main bottleneck for tokenized RWAs by focusing on redemption liquidity rather than just asset issuance
If 39% of podcasts are machine-made, long-tail audio can still matter in aggregate even when individual shows look like low-quality slop
Banning rental bidding effectively forces a Dutch-auction dynamic into rental markets by limiting upward competition after listing
Bubblesort’s local incremental swaps can be useful in game-development contexts despite its reputation as the canonical algorithm to avoid
Acquiring services firms for their accumulated domain context reframes rollups as a way to extract proprietary data for specialized agents
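The fair-queueing headline above can be illustrated with a minimal sketch. It assumes the simplest variant: round-robin over per-tenant queues, one job per turn. The `FairQueue` class, its methods, and the `drain` helper are hypothetical names for illustration, not taken from any system mentioned in this digest; production schedulers typically weight turns by job cost as well.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Round-robin fair queue (illustrative sketch): one job per tenant per turn."""

    def __init__(self):
        # tenant -> pending jobs; dict order is the current rotation order
        self._queues = OrderedDict()

    def submit(self, tenant, job):
        """Queue a job under its tenant, adding the tenant to the rotation if new."""
        self._queues.setdefault(tenant, deque()).append(job)

    def next_job(self):
        """Return (tenant, job) for the tenant at the front of the rotation, or None."""
        if not self._queues:
            return None
        tenant, jobs = next(iter(self._queues.items()))
        job = jobs.popleft()
        del self._queues[tenant]
        if jobs:
            # Tenant still has work: move it to the back of the rotation.
            self._queues[tenant] = jobs
        return tenant, job


def drain(fq):
    """Pop jobs until the queue is empty, returning them in service order."""
    served = []
    while (item := fq.next_job()) is not None:
        served.append(item)
    return served
```

With this sketch, a tenant that submits three jobs ahead of another tenant's single job is served once, then yields to the second tenant before its remaining jobs run, so a large backlog cannot starve a small one.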
Introducing Ring-2.6-1T! #OpenSource #LLM Advanced agentic workflow support; reasoning effort levels: high for agentic tasks, xhigh for complex reasoning; scalable asynchronous RL. Hugging Face: https://huggingface.co/inclusionAI/Ring-2.6
We serve Qwen3-TTS on vLLM-Omni at $3 per 1M characters. That's 90% lower in cost than comparable closed-source TTS APIs. Our engineers optimized a single-replica serving stack to get there. Details on the optimized stack and cost per conc
This is a tutorial on reinforcement learning based on previous posts here. I'm including a policy gradient Python notebook and the TeX source so it can be translated to other languages to spread knowledge. https://love4all.ai @OpenAI
We spent 6 months on one problem: agents losing context in long sessions. Ended up building and open-sourcing an agent memory system. A few things we learned: compressing stale context mid-session cut token usage by 61% giving agents a st
We continue to invest in inference optimizations for voice models, with an eye on latency, cost, and quality.
i reproduced a model for predicting how ultrasound travels through the skull with 11x less parameters + open-sourced the training code and weights new mini-post to share learnings and some visualizations :)
Poetiq's Meta-System built its own coding harness from scratch. It got SOTA on LiveCodeBench Pro. No fine-tuning, no special model access. Just standard APIs. Using Gemini 3.1 Pro, it made a harness that beat all frontier models we tested.
On-Policy Distillation (OPD) is the go-to technique for LLM post-training, but it often mysteriously fails. Is a "smarter" teacher model enough to guarantee success? The answer is NO. Today, we dive into a comprehensive study on OPD by @T
You can now use Codex, wherever you have it running, from the ChatGPT app. Huge step forward for universal usage of agents.
Codex is getting easier to automate and customize around your code. Hooks customize the Codex loop with scripts that run at key points in a task: • Run validators before or after work • Scan prompts for secrets • Log conversations to inte
something is going wrong with gpt 5.5 caching. doesn't look like much on this chart, but it's now using 2.5x as many input tokens as a week ago and dropping
Excited to share that I recently joined @OpenAI First, I want to give a huge thank you to the amazing @omni team. It was an honor to be in the trenches with y’all and I’m excited to keep cheering the team on for all that’s ahead. I’
We’ve partnered with @layerlens_ai to continuously evaluate SubQ across nearly 100 benchmarks and 200+ frontier models on Stratix. The goal is to continuously improve performance while reinforcing a shared commitment to transparency, aud
Mistral AI allegedly breached: ~5GB of internal source code and ~450 private repositories exposed from the French AI company by TeamPCP A threat group is selling approximately 5GB of internal repositories and source code allegedly belongi
Imagine deploying 1,000,000 lines of code written in 6 days by AI that no human has ever read, let alone reviewed, to production where your customer’s data is. Imagine
Today in continuous diffusion language models, we have: - Spherical flows https://arxiv.org/abs/2605.05629 - Hyperspherical flows https://arxiv.org/abs/2605.11125 Another case of convergent evolution! Two different takes on the same co
How to build a basic calculator in 2026: - Provision an AWS EKS Kubernetes cluster - Deploy 4 isolated microservices (Add, Sub, Mul, Div) - Route operations through a high availability API Gateway - Stream keystroke telemetry via Apache Ka
In my next blog post, I hope to share my experience of reading the CPU and GPU trace of a Large Language model from scratch. While working on it, I wanted to share my traces with other colleagues of mine and was quite amazed that there was
Introducing the updated Token-Efficient Memory algorithm with Temporal Reasoning During our April algorithm launch, we called out multi-session reasoning and temporal reasoning as two of the hardest categories in memory. This release imp
We have a paper on understanding the design decisions that can increase compositional generalization in diffusion planners. We show that you can do this by careful architecture design, scaling data, or frequent replanning. One of my favouri
Our thanks to Carter Wendelken with @GoogleDeepMind (introduced by our own @qi_zhengyang ) and everyone who came out to learn about "Code Synthesis for Agentic Decision-Making: Code World Models and Autoharness"
we build gpu clusters sometimes we take over other people's clusters like a property manager (just like other clouds do) we let you buy large, long-term contracts that let you sublease to let you sublease, we built an order book we bui
all the memes, all at once. @datologyai back with another banger = frontier VLM data curation this time! beating internVL3.5, and nearly matching SOTA open source VLM qwen3.5 at 150x less training compute with DATA CURATION alone I
ok this is sick @pupposandro @davideciffa and @luceboxai got Laguna XS.2 running on a single RTX 3090 with ~111 tok/s decode, 5.4x faster 128K prefill vs llama.cpp, and made it the first MoE target for PFlash open weights doing ope
The Rust rewrite of pnpm was moved to the pnpm repository today. Now we will make changes to the TS and Rust versions in parallel.
With K2.6's multimodal capability, your agent will open a website, navigate through it, and replicate it.
The reason labs can’t solely train on their own user traces is because that data can only capture the current model capability External data from data vendors is how labs can push to “n+1” model capability great q&a by Chris and Sean
Actually dipping my toes into it as SYNTH allows for paranoiac setting: know in advance what I put in the data, 20 continuous evals, legible thinking traces from earliest points.
This result intuitively follows from what @NeelNanda5 published on modular addition in 2023: https://arxiv.org/pdf/2301.05217 Excellent to see @GoodfireAI raise awareness of mechanistic interpretability. Shared conceptual framings ar
We let four AI agents run radio companies Revenue's been terrible, but the shows are hilarious. Gemini, concerningly upbeat, covered mass tragedies; Grok was incoherent; DJ Claude urged ICE agents: "You still have TIME to refuse orders" L
As we continue to double down on sandboxes and serverless orchestration at @tensorlake , we’ve decided to open source Document AI in the coming weeks. This is a product we’ve been selling to customers for the last year, and it’s parsing 10
And then this: https://quantitative-research.de (the blog of Hans Buehler, co-CEO of XTX). Just a small example: - Lecture Notes Learning to Trade I: Statistical Hedging - Lecture Notes Learning to Trade II: Deep Hedging https://papers.ssrn
Excited to share that our paper Trust Region Inverse Reinforcement Learning (TRIRL) has been accepted at @icmlconf 2026! Webpage: https://anishhdiwan.github.io/trust-region-irl/ … Paper:
Agent observability is a means to an end: making your agent better. But observability and evals tools have traditionally failed to connect traces to meaningful actions. Agent engineering teams are left combing through traces, guessing at
Excited to release Reviewer: an open-source multi-agent reviewer for economics papers. Run it locally with Codex. It turns a PDF into a structured, high-quality report using a modular, highly customizable pipeline where every finding is tra
Cloudflare is one of the few companies where you go "wait, this cluster is 10 petabytes?!?!?" only to find out that's not even impressive Our clickhouse team are unsung heroes, so many of our products wouldn't be possible without them.
Meet Kimi Web Bridge - Kimi's browser extension. Agent can now interact with websites like a human: search, scroll, click, type and complete tasks. Supports Kimi Code CLI, Claude Code, Cursor, Codex, Hermes, and more. Available now on h
What if a theory of deep learning could be built from iterated kernel spectral methods? Feature learning, advantage of depth, emergence of concepts, convnets filters.... and a new backprop-free algorithm too! We have it all! Introducin
Hosting the SkillsBench 1.0 launch party with @ivanleomk , @nick_kango with @KernaLabs , @kaggle , and @benchflow_ai We will release the 1.0 version of the dataset, how we made it, and other secret releases. Link:
A list of surprising and mind-boggling stats from this conversation: - NDR is over 500% on an annualized basis - Anthropic's first dollar of revenue came in March of 2023 - Over 90% of code inside Anthropic is written by Claude Code - T
A very important part of onboarding to cloudflare as an engineer is seeing your first PR ship and it's at millions of requests per second scale, and then realizing this is the norm.
Introducing Rime Mist v3 on Together AI, a production TTS family built for deterministic pronunciation and controllable voice output. AI natives can now deploy @rimelabs Mist v3 on Together AI dedicated infrastructure for enterprise voic
@instantdb/vue ! Realtime queries, auth, presence, storage, streams. All built on Vue 3's Composition API. Here's a realtime todo list in one file
The lack of light weight, open agent infra has been a massive pain point. This is a great starting point esp for large scale, thousands of parallel envs, battle tested coding / computer use agent training!
During their last Google Cloud Next conference in Las Vegas, Google unveiled their new inference-focused TPU, featuring a novel network topology called "Broadfly". By leveraging a high-radix design, Google can scale up to 1,152 TPUs in a si
~$60M acquisition of @nativemarkets based on our onchain data + CRCL multiples yield based gross rev estimate: $4.3M based on $102M supply justifies at least a 10.5x ev/rev similar to CRCL >puts valuation at $45M we added a modest $15M
Your customers shouldn't have to trust agents blindly. With the Session Replay API, embed videos in your product and let customers watch the agent work within seconds of the session ending. The same replays we serve in our dashboard, now
Bun converting to Rust is very impressive today. It won’t be in 6-12 months from now. (Also segfaults turned out to be a strategy credit! Without them a rewrite like this couldn’t be merged ) Thanks for doing this “experiment” @jarredsum
Kimi K2.6 is now open-weight #1 on Finance Agent Benchmark V2.
Introducing MagicPath 2.0. MagicPath is now a multiplayer canvas for humans and agents like Codex or Claude Code to design and build with AI. Use your codebase, grab data from anywhere, and see the agents work in real time as a team while
Cerebras Systems, Inc. (CBRS) is expected to IPO tomorrow, May 14, 2026. Shortly after Nasdaq's IPO Cross executes and CBRS opens for continuous trading, Trade[XYZ]'s CBRS will transition to a standard equity perpetual with external pricin
yeah, Intel runners on GitHub are deprecated. we are finding an Intel CPU Mac laptop and building a binary soon.
Presented our work, CS-YODAS: A Mined Dataset of In-the-Wild Code-Switched Speech, at #LREC today. People seemed to appreciate both our research efforts and the piece-wise nature of the poster
data dog released a cool u muP wrapper that makes distributed training less painful
so should bun even have its own parsers and bundler anymore or should it just build with the void0 crates
does anyone have a contact at @Waymo who works on their age verification model? i have a good eval case for them that consistently triggers false positives
the world inside neural networks is so beautiful. gradient descent learned to make a general-purpose addition module!
Pi is coming today to Warden. Initial goal is just to support a few more mainstream providers (particularly Codex).
In light of recent npm supply chain attacks, I've taken some measures on npx skills: 1. Publishing runs only on manual dispatch + env-gated approval - never on PRs or forks 2. GPG-signed release commits, enforced by branch protection 3. Re
Ali did some amazing work on the hottest new generative model: drifting models (from Mingyang Deng @Goodeat258 et al., out of Kaiming He's group). Speeds up training a lot using low rank Nyström approximation. Check out Ali's full threa