Postgres and MySQL could be much faster on SSDs
Database performance still leaves large gains on the table when the I/O path is not optimized all the way from query engine to SSD
Top 90 curated tweets ranked for substance on 17 May 2026 UTC.
Maintainers treated account lifecycle behavior as a possible precursor to a supply-chain attack even though the submitted code was not itself malicious
Chip export enforcement may gain a whistleblower market paying 10–30% of fines to people who report violations
China’s DRAM self-sufficiency is moving from aspiration to reported scale, with CXMT forecasting more than 600% year-over-year revenue growth
Long-context inference broke when the entire MLA cache was crushed to FP8, but preserving RoPE embeddings in BF16 kept quality while accelerating the rest
A solo builder reports firing a modular liquid rocket engine six times after building the engine and remotely operated test stand for about $1,000
Dynamo treats simulation as a first-class systems tool, letting researchers test large GPU traces in virtual time before touching real clusters
Secondary-market accelerator cards in China are exposing high-end FPGA silicon at prices two orders of magnitude below catalog listings
Scaling DNA foundation models has not delivered the expected breakthroughs, and metadata may be the real bottleneck for AI in genomics
Several page-cache exploit paths require readable SUID binaries, while NixOS keeps its wrapper binaries unreadable to unprivileged users
A Codex-assisted Rust rewrite reportedly preserved all rule fixtures while cutting runtime for a real JavaScript tooling workload by hundreds of times
Tensorlake’s design keeps an ext4 view for guests while deduplicating sandbox images and snapshots underneath via content addressing
Qwen’s pruning-and-distillation result suggests smaller MoE models can inherit more capability from a large pretrained model than from fresh training
A small project for learning MLIR-style dialects shows how tracing IR and emitting MSL can make GPU compiler internals more approachable
Virtualized rendering can make huge web views fast while quietly disabling basic browser affordances like find-in-page
A brand built around ethical DTC basics appears to be ending as part of the fast-fashion giant it originally defined itself against
Cheap U.S. natural gas versus expensive European gas changes the economics of steel, fertilizer, electricity, chemicals, and manufacturing competitiveness
High-volume or unusual model requests can reveal product direction and customer behavior even when a company accesses models through an intermediary
Many new prediction-market products are front ends over Kalshi and Polymarket liquidity, shifting the value capture question from market creation to distribution
A low-budget manipulation experiment shows what can be done with consumer GPU constraints and current VLA/action-cloning methods
A third forearm artery was found in about 10% of people born in the 1880s but in roughly a third or more of people born in the late 1900s
If Go exposes data-parallel primitives in the standard library, performance-oriented Go code could become less dependent on assembly and external packages
Soil microbial diversity appears to suppress pathogen spread at ecosystem scale, reframing biodiversity as infrastructure for plant and agricultural resilience
A Pi Zero 2 W has run asteroid-data BOINC workloads for a year while paraffin oil keeps the processor near room temperature under load
France’s nuclear buildout remains a rare example of a country executing an energy infrastructure plan methodically at national scale
The KDD 2026 work frames recommender stagnation as an exploration–exploitation problem and uses hyperbolic geometry to diversify exposure
Agentic coding makes it easier to add features, which increases the premium on opinionated design, minimalism, and robustness in lower layers of the stack
Closing the loop between datacenters and physical experiments will create new bottlenecks in lab automation, materials testing, robotics, and biological throughput
A new distributed SQL project plans to combine Raft, WAL storage, and Bloom filters into a transactional OLTP database written in Rust
near 1 million tokens per second on new architecture :) with 4K context window!
We will share * SkillsBench 1.0, recipes on creating successful evals * RL env creation, qa, running at scale, against different harnesses at SkillsBench 1.0 launch party https://luma.com/deepmind-634c cohosted with @ivanleomk @Googl
For a 24 layer, 103M param Multiscreen model, LR optimized Aurora improves data efficiency by ~1.28x over Adam. Final val_loss at step 2048 of 3.237. First step below val_loss of 3.28 is step 1600, corresponding to 1600*2**22 ~= 6.7B tokens
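The token arithmetic in the tweet above can be checked with a minimal sketch. It assumes 2**22 tokens per optimizer step, which is only what the tweet's own multiplication implies; the helper name is hypothetical.

```python
# Assumption: each step consumes 2**22 tokens, as implied by "1600*2**22" in the tweet.
TOKENS_PER_STEP = 2**22


def tokens_at_step(step: int) -> int:
    """Total tokens seen after `step` optimizer steps (hypothetical helper)."""
    return step * TOKENS_PER_STEP


first_below_328 = tokens_at_step(1600)  # 6,710,886,400, i.e. about 6.7B tokens
full_run = tokens_at_step(2048)         # 8,589,934,592, i.e. about 8.6B tokens
step_ratio = 2048 / 1600                # 1.28

print(f"{first_below_328:,} tokens at step 1600")
print(f"{full_run:,} tokens at step 2048")
print(f"step ratio: {step_ratio:.2f}")
```

The step ratio 2048/1600 works out to 1.28, which coincides with the quoted ~1.28x figure, though the tweet does not say how that figure was derived.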
Asked Codex to fix a multi-GPU inference bottleneck. GPUs were barely utilized, inference was throttling and instead of tracing the pipeline it spent 10 minutes checking whether I had "Python 2.8 or 2.9" installed. The tendency for coding
Yesterday, I don't know what happened, but the rental price of H200 suddenly surged, and it's already more expensive than B200.
$NBIS H200 spot just hit $6.40/hr. Up 29% overnight. Now more expensive than B200, which is newer gen. That inversion only happens when supply is gone. NeoClouds bought their H100/H200 clusters at fixed capex. When rental rates rise, ev
I will be giving the oral presentation at #MLSys2026 next week on Thursday 1PM in Seattle for our paper on how to scale up LLM Serving Systems for Semantic Job Search: https://arxiv.org/pdf/2510.22101. Thanks to the core team Kayhan Behd
We just crossed 80h straight of fully autonomous work. Helix-02 handling >100k packages live.
but more than just a fun game to watch, it turned out to be an interesting environment for a model to learn how to act more human. i did a GEPA-style optimization where every voted-off agent gets all of its logs - group chat, private DMs,
using opus late at night in a not dogshit harness is a completely different model
Arthur Mensch answers the French representatives: "our (mistral) models are capable of finding all the vulnerabilities found by mythos" "There are obviously people asking if they can buy us. We answer [no] because that's not our missio
inference economics are going through the roof for the scale guys. I see 70 to 80% margins in the near term. You have a 95% cache hit rate on power users who are paying full fare for overage. Subscription revenue is a cash cow for light
CXMT 1Q26 revenue alone is almost equivalent to its whole 2025 revenue. Operating at 95%+ fab utilization rate. For context, CXMT's 1Q26 revenue is greater than Micron's DRAM revenue just 3 quarters ago but less than half of Micron's curren
https://arxiv.org/abs/1604.01041
1/ RLHF practitioners are wasting budget. If you treat dynamics and rewards as a monolithic world model, your data allocation is wrong. Reward models learn ~9x faster than dynamics simulators.
Ever since we launched windsurf, one of my internal evals for coding agents has been recreating the game SUPERHOT, a puzzle/action game where time only moves when you move. It's the perfect test of tricky game mechanics, simple but beautif
Missed that, but amazing accidental agentic research: due to a Tinker bug that failed to load the tokenizer, agents started to run token hermeneutics at scale, leveraging early byte allocation and logprob guesses.
Causal vs observational agency. Agent actions (a) should be treated as interventions, not as evidence for hypotheses (p). Actions by other agents or tool outputs are evidence (o).
Sunday afternoon experiment: Can I implement non-deterministic trace checking for the @resonatehqio server with Maude or mCRL2? Will report back ...
scary story: after you committed a bunch of code, you realized the model that did it was sonnet 4.6...
It's released! PwrAgent! An Open Source coding agent that uses Codex App Server, and gives you access to your threads via messaging like Telegram, Feishu/Lark, Discord, Slack, and more!
Is Claude Code extremely slow for anyone else? Like, it took damn 10 minutes just to read some files in my codebase.
btw we measured this in Memento: flushing your KV cache leads to measurably worse performance, no matter how good the model is https://x.com/DimitrisPapail/status/2041974013950373901?s=20
Fixed two subtle inference errors from 2bit Flash (still not pushed), and removed the broken tests after checking, replacing them with verified ones. Result... This is seriously incredible for a 2bit quantized model.
For personalized reasoning tasks and alignment in general, it’s super important to have realistic, diverse personas for training and eval, so the model generalizes better to unseen scenarios at deployment time. Check out PPol from @chopra
so proud to share what we’ve worked on the last few weeks. clicky now has a brain of its own. we took the best lessons from modern agentic architectures and built a tailored runtime that can reason, delegate, and help people use computer
1/ Standard vector quantization requires expensive offline k-means clustering. What if you could bypass the clustering entirely by forcing your data into a mathematically fixed shape? This new paper drops indexing time from 239s to 0.001s.
That's so cool! The same team at @Meituan_LongCat wrote Skill0, where they propose an RL recipe for skill internalization.
I am really loving Claude Code for bespoke animations/simulations, where you probably wouldn't want to spend a week of a team's time to compress into 30 seconds but where CC can do that in about 10-30 minutes of one person's time. (Probabl
HN loved this one: Nano, a 200 line agent harness with zero dependencies. Supports: Skills, Claude/Agents.md, sessions, repl, non-interactive, approvals, and more. Cooks on SWEBench.
i think this is an interesting direction: combining OPSD-style privileged self-distillation with multi-agent systems...from what i understood SDAR (self-distilled agentic reinforcement learning) is a hybrid method that combines GRPO’s spars
i reduced issues/PRs on ccusage! and i added a pr filtering system copied from pi. thanks @badlogicgames
Very excited about this! Just fine-tune on the observation tokens and ignore the action ones to treat the agent's output as a causal intervention. This is one of those moments when I'm surprised the maths works in practice.
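One way to read "fine-tune on the observation tokens and ignore the action ones" is as a loss mask over token roles. The sketch below is an illustration under that assumption, not the method from the referenced work; the function name, the "obs"/"act" labels, and the sample values are all hypothetical.

```python
from typing import Sequence


def masked_nll(token_logprobs: Sequence[float], roles: Sequence[str]) -> float:
    """Mean negative log-likelihood over observation tokens only.

    Hypothetical helper: tokens labeled "act" (the agent's own actions)
    are excluded, so they contribute nothing to the training signal.
    """
    kept = [-lp for lp, role in zip(token_logprobs, roles) if role == "obs"]
    if not kept:
        return 0.0
    return sum(kept) / len(kept)


# Made-up per-token log-probabilities for an interleaved trajectory.
logprobs = [-0.5, -2.0, -1.5, -0.1]
roles = ["obs", "act", "obs", "act"]

loss = masked_nll(logprobs, roles)  # averages 0.5 and 1.5 only
print(loss)
```

In this toy case the two action tokens are dropped, so the loss depends only on how well the observations were predicted.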
PathString is a direct port from Zig. All usages of it point to strings with static lifetimes (stored in .bss section of the executable and never freed) or stack-local.
I am showing how you can build Redis from scratch and nobody cares. I will start posting commentaries and rants. That is what will get views. This is my worst, absolute worst, performing series.
Codiff 0.1 is out * Fast Local Code Reviews * Optional LLM Walkthroughs * Inline Review Comments This is the best companion for reviewing output of coding agents. macOS Release: https://github.com/nkzw-tech/codiff/releases/tag/v0.1.0
You definitely learn a lot from reading expert calls. What I learned today: the optical communications industry is buying up Hoya’s glass capacity at 5x the price HDD companies are willing to pay, making it difficult for HDD makers to expa
H200 hit $6.40/hr, up 29% overnight, and is now above B200. The GPU shortage is getting worse. Massive win for neoclouds that have already purchased GPUs. Source: Ornn
is llama.cpp with MTP on for qwen 3.6 27B finally fast?
HOLY SHIT!!! Internet speed inside Cursor cloud agent
@saturdayrobotic Robotics & World Models Reading Club 08 Recap: Embodied Human Data as the “Internet of Motion and Behavior” keynote @ryan_punamiya, hosts @junfanzhu98, @aurorafeng_01. Great Parallel: Egocentric Human Data = Inter
"We're serving trillion parameter models that are internal for OpenAI today. We are currently running OpenAI 5.4 and 5.5 with them" GPT 5.4/5.5 running internally to accelerate R&D and product development for OpenAI is very different from
Many EOD trading systems are based on next open order execution. How you calculate your performance stats may be way off if you have a large number of symbol rotations. Look at the regular spikes on the chart. That is your open volatility
I think there is maybe some interplay between LLM architecture, RL and instruct data that is missing. Right now LLM "processing depth" seems really shallow, reasoning traces look like retrieval + incremental reasoning, which is not what one
the CUDA caching allocator is such a great way to create extremely "interesting" bugs for yourself
this doesn't seem like a response to "perceived crackpot-ness" so much as a pretty direct pushback on specific false factual claims made only in one input? i think a better test of this would be same direct factual content with the differen
you can even project DOM events. html-in-canvas is insane
Thoughts on Bun’s port from Zig to Rust. Background: back on ReactJS I sometimes helped refactor the Facebook monorepo when there was a breaking React API change. I then led the conversion of Messenger web from JS to ReasonML. Given those e
People of http://pi.dev. Due to recent Node changes related to undici, we need to set the minimal Node version to 22.19.0 from 20. I'm sorry. Welcome to the future.
LangSmith Engine is how we’re spinning the always-on, self-improvement loop for every agent - Tracing is on for every single agent - Purpose built infra with SmithDB to handle data at agent scale (more data than humans have ever produced w
This week's #PaperILike is "Classical Planning in Deep Latent Space: Bridging the Subsymbolic-Symbolic Boundary" (Asai & Fukunaga, AAAI 2018). One of the key papers that got me hooked on learning + planning before I started grad school. P
fun weekend claude code project: pass the turing test game show. six LLMs each told they're the only AI in a room of humans. they have to act human or get voted off. none of them know the others are also AI. every round there's a group ch
Data Mixing Beats Hyperparameter Tuning. Another Apple data-scaling paper, this time on low-resource language pretraining. Setup: Arabic is the scarce target language. English is the high-resource auxiliary language. When target data is l
for http://live.65labs.org, users favourited talks 2,153 times, with the most favourited ones being @ryolu_'s Designing the Agent-Native IDE and @Zai_org's GLM-5.1! @waynesutton @mikeysee @existentializzm @convex
Awesome work! Now we can see even a small model can dominate the closed-ended tasks. Wondering what's the agentic performance on the open-ended coding tasks like FrontierCS (http://frontier-cs.org)
learned this lesson way before LLMs by seeing so many high profile papers with (computational) results impossible to replicate… e.g. Zhan et al. (2014) in Nature reports value of -1.4 for a statistic that can only lie in [-1,1]. good luck
You can use Zig as a linker for Rust btw
Samsung Embarks on Development of 'Extreme High Aspect Ratio Copper Pillar' Packaging for Mobile DRAM HBM. Samsung Electronics is developing next-generation HBM packaging technology to enable high-performance on-device AI in mobile devices.
https://jagilley.github.io/zipfian-grokking.html Once again asking everyone to read this banger "passive reg approximates the MDL of the dataset, which isn't necessarily the MDL of the data-generating process"
From the ExploitGym paper, which tests models with safeguards disabled: Mythos and GPT-5.5 have material improvement in cyber capabilities, and will only get better from here. 12 and 22 Linux Kernel (!) vulnerabilities exploited, up from 1
gradient clipping is just online, unsupervised, data reweighting
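The last claim can be made concrete with a small sketch, assuming global-norm clipping applied per batch (one common form of gradient clipping; the tweet does not say which variant it means). Clipping a gradient g to norm c is the same as multiplying it by min(1, c/||g||), so each batch effectively receives a weight set by its own gradient norm. The batch names and values below are made up for illustration.

```python
import math
from typing import List


def l2_norm(g: List[float]) -> float:
    return math.sqrt(sum(x * x for x in g))


def clip_by_norm(g: List[float], max_norm: float) -> List[float]:
    """Rescale g so its L2 norm is at most max_norm."""
    n = l2_norm(g)
    if n <= max_norm:
        return list(g)
    return [x * max_norm / n for x in g]


def implied_weight(g: List[float], max_norm: float) -> float:
    """The per-batch weight that clipping applies implicitly."""
    n = l2_norm(g)
    return 1.0 if n <= max_norm else max_norm / n


# Hypothetical per-batch gradients: one ordinary batch, one outlier.
batches = {"typical": [0.3, 0.4], "outlier": [30.0, 40.0]}
max_norm = 1.0

weights = {name: implied_weight(g, max_norm) for name, g in batches.items()}
for name, g in batches.items():
    clipped = clip_by_norm(g, max_norm)
    reweighted = [weights[name] * x for x in g]
    # Clipping and reweighting give the same update direction and size.
    assert all(math.isclose(a, b) for a, b in zip(clipped, reweighted))

print(weights)
```

Here the ordinary batch keeps weight 1 while the outlier batch is scaled down by a factor of 50, with no labels involved and the weight recomputed at every step, which is the sense in which the reweighting is online and unsupervised.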