Sequencing a human genome to 30× coverage at home
Every step from saliva collection to 30× sequencing happened in a single room, showing how quickly genomics hardware is moving from labs toward hobbyist workflows
Balanced toward durable artifacts and concrete technical/current-news substance while limiting the many agent-tool launches and near-duplicate Huawei/DeepSeek threads
A 20-bit memory cell array made with homemade sputtering and lithography tools turns semiconductor fabrication into a plausible garage-scale experiment
The same package-manager security issue got fast advisories and backports from pnpm but silent fixes from Bun, illustrating how the disclosure process matters as much as the patch
APKPure shipped Telegram installation packages embedded with a spy framework that collected chats, contacts, files, location, and media, making app-store trust a direct security boundary
Aalo says all hardware for a 10 MW zero-power criticality test is complete and fuel is on site pending regulatory approval, a concrete milestone for a new reactor startup
A free 1.1 edition of High Performance Git turns scattered performance folklore around a ubiquitous tool into a durable reference
Custom homebrew apps on iPod nano 7th generation reopen an abandoned consumer device as a small, constrained developer platform
mKernel fuses compute and communication into persistent GPU kernels across intra- and inter-node systems, attacking a real bottleneck in distributed training
Biology model training needs fresh experimental data, and an automated lab running around the clock is a glimpse of how that data is produced
DeepSeek-style reductions in HBM demand shift China’s AI stack toward domestic NAND, LPDDR, ASIC, and CPU suppliers rather than US-controlled GPU chokepoints
Logic splitting and packaging, not lithography, are the key techniques behind Huawei’s 1.4 nm-equivalent roadmap claims
DeepMind’s AlphaProof Nexus solved nine open Erdős problems with an agentic formal-proof-search framework, pushing AI math from demos toward open-problem work
A homebuilt computer reaching a count-to-100 program proves the ALU, registers, branches, jumps, GPIO, SRAM, bootloader, and program counter all work together
Flow cytometry is one of biology’s core measurement techniques, and a clear explainer makes the machinery behind cell sorting and profiling legible
A synthetic miRNA feedback loop that represses its own synNotch receptor points toward programmable, self-regulating cell therapies
Amazon, Meta, Alphabet, and Oracle issued $159B in bonds this year for AI data centers, including $50B in foreign-currency reverse Yankees
Figma’s Q1 report—46% revenue growth, 139% NDR, raised guidance—counters the story that AI design tools or Adobe limbo had stalled the company
A development-economics working paper revisits the cleanest evidence for poverty traps and finds the central idea harder to pin down than the textbook story suggests
A detailed critique of the AI water-use statistic matters because infrastructure debates can be derailed by a single bad denominator
Robot-learning datasets are multi-rate and multimodal by default, and .rrd is designed around that reality rather than treating logs as ordinary videos or tables
A single low-context CCTV frame can identify the house a camera network belongs to, collapsing the privacy boundary between public screenshots and physical addresses
Schedule-free spectral optimization matching or beating heavily tuned AdamW across 125M and 772M parameter language models hints at simpler training recipes
Chat UIs stress virtualized lists in unusual ways, and first-class end anchoring, streaming, and stable prepends make that complexity boring
A rare use-after-free in Linux’s epoll subsystem is the kind of core kernel bug worth studying because the primitive sits under enormous amounts of production software
Paris’s RER became one of the world’s great regional rail systems partly through a Japan study trip, a reminder that infrastructure progress often travels through imitation
Practical demonstrations of a library-card designer and a new control make interaction design concrete instead of another gallery of static screenshots
A compact additive-combinatorics question about ternary and base-4 digit restrictions can pull mathematicians and programmers into the same problem
A deep essay on Jira as computation reframes a disliked enterprise tool as a system with state, transitions, and social semantics
LIFT is the SFT recipe for dLLMs that actually understands the masking dynamics. Vanilla SFT on dLLMs often HURTS performance, and they finally pin down why. Their analysis: vanilla SFT overlooks learnability. Rare tokens are difficult to
Reinforcement learning research with Joseph Suarez
Approximate Machine Unlearning through Manifold Representation Forgetting Guided by Self Mode Connectivity
MiniCPM5-1B is now fully open source, including weights, training data, and deployment code. 1B params, #1 on Artificial Analysis among all open models under 2B (17.9 pts). https://modelscope.cn/models/OpenBMB/MiniCPM5-1B … Beats Qwen3
Why is the KV cache one of the main reasons LLMs are fast? The KV cache is what connects the attention mechanism with the generation stage of autoregressive models. These models generate text token by token, but each new token still attends to all previou
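A minimal sketch of the idea in the post above, assuming a single attention head with toy 2-dimensional projection matrices (all numbers are illustrative, not taken from any real model). It compares recomputing every key and value at each decoding step against appending to a cache, and counts the key/value projections each approach performs.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(m, v):
    return [dot(row, v) for row in m]

def attend(q, keys, values):
    """Softmax-weighted sum of values for a single query vector."""
    scores = [dot(q, k) / math.sqrt(len(q)) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(len(values[0]))]

# Hypothetical projection matrices, chosen only for illustration.
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[0.5, 0.1], [0.2, 0.7]]
W_V = [[1.0, 0.3], [0.0, 1.0]]

def decode_without_cache(tokens):
    """Re-project every earlier token at every step: O(T^2) projections."""
    outputs, projections = [], 0
    for t in range(len(tokens)):
        keys = [matvec(W_K, x) for x in tokens[:t + 1]]
        values = [matvec(W_V, x) for x in tokens[:t + 1]]
        projections += 2 * (t + 1)
        outputs.append(attend(matvec(W_Q, tokens[t]), keys, values))
    return outputs, projections

def decode_with_cache(tokens):
    """Project each token once and keep its key and value: O(T) projections."""
    keys, values, outputs, projections = [], [], [], 0
    for x in tokens:
        keys.append(matvec(W_K, x))    # the cache grows by one key...
        values.append(matvec(W_V, x))  # ...and one value per new token
        projections += 2
        outputs.append(attend(matvec(W_Q, x), keys, values))
    return outputs, projections

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
slow_out, slow_cost = decode_without_cache(tokens)
fast_out, fast_cost = decode_with_cache(tokens)
print(slow_cost, fast_cost)  # 20 8
```

Both functions return the same attention outputs. The cache only removes the repeated projection work; each new token still attends over every stored key, so the attention step itself stays linear in the sequence length.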
RL has largely been a consumer of a deep learning toolkit that was developed for supervised learning. In our recent work we explore RL specific hierarchical state representations that allow agents to overcome issues with low quality demonst
LLMs can extract recurring reasoning fragments from prior traces and compile them into a concise reusable behavior handbook, used in-context at inference or distilled into the model (Meta, Mila-Quebec AI Institute, Princeton University) h
launching AgentIR Blackbox (https://agentir.dev), an LLM request router for agent systems. Blackbox finds which LLM calls are on your workflow’s critical path, sends them to faster providers, and routes less urgent calls cheaper to maintain
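To make "critical path" concrete, here is a hedged sketch of how one might find it in a workflow of dependent LLM calls. This is not AgentIR's implementation: the call names, latencies, and dependency graph are hypothetical, and the approach is the textbook longest-path computation over a DAG.

```python
from graphlib import TopologicalSorter

def critical_path(latency, deps):
    """Return (path, total_latency) for the slowest dependency chain.

    latency: call name -> estimated seconds (hypothetical numbers).
    deps:    call name -> set of calls that must finish first.
    """
    finish, prev = {}, {}
    # static_order() yields every call after all of its prerequisites.
    for call in TopologicalSorter(deps).static_order():
        blockers = deps.get(call, ())
        slowest = max(blockers, key=lambda d: finish[d], default=None)
        start = finish[slowest] if slowest is not None else 0.0
        finish[call] = start + latency[call]
        prev[call] = slowest
    # Walk back from the call that finishes last.
    call = max(finish, key=finish.get)
    total = finish[call]
    path = []
    while call is not None:
        path.append(call)
        call = prev[call]
    return path[::-1], total

# Hypothetical agent workflow.
latency = {"plan": 2.0, "search": 1.0, "summarize": 4.0, "lint": 0.5, "answer": 1.5}
deps = {
    "search": {"plan"},
    "summarize": {"search"},
    "lint": {"plan"},
    "answer": {"summarize", "lint"},
}

path, total = critical_path(latency, deps)
off_path = sorted(set(latency) - set(path))
print(path, total)  # ['plan', 'search', 'summarize', 'answer'] 8.5
print(off_path)     # ['lint']
```

In this toy graph only `lint` has slack, so it is the one call a router could send to a slower, cheaper provider without lengthening the end-to-end run.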
AgentMail now supports IMAP. Open your agent's inbox in any email client. Debug by just looking at it. You could already send. Now you can read too. Documentation in replies.
While @SpaceX was launching rockets we were using @Starlink to remotely inference our excavator robot model that we trained with 2.5 hours of operator data. We are teaching heavy machines to do real tasks on job sites by learning from
Long-horizon LLM agents accumulate conversation histories that blow past the context window. The usual fix is LLM-based summarization, which is lossy AND blocks the agent for tens of seconds while the summarizer runs. Parallel Context Comp
So software built with @zml_ai runs transparently at max speed on: CPU, NVIDIA GPUs, AMD GPUs, Google TPUs, AWS Trainium, Intel GPUs, Tenstorrent NPUs, and Apple GPUs (very experimental). And more to come.
3.8% for Claude Opus 4.7 and 0.0% for Gemini 3.1 Pro. SaaS-Bench from UniPat AI just dragged Computer-Use Agent benchmark theater into the cold light. They put 23 real open-source SaaS systems into Docker with full DB state and business con
Phase 2 of my heuristic-learning ImageNet-10 experiment: Inspired by @Trinkle23897 's “Learning Beyond Gradients,” I used Claude Code + Codex to iteratively improve a pure symbolic vision system. No neural nets. No backprop. Just visual
We heard concerns that Antigravity consumes many tokens for simple tasks now. So, we're adding Gemini 3.5 Flash (Low) as a way to optimize token usage for these tasks. In our internal testing, it generates around 45% fewer tokens than Gemin
in theory it can be super fast thanks to speculative decoding, but batching with other normal requests probably slows it down
Bridge + @Tiny_Fish in action: Use one request to sign in to Hacker News, capture the session, and ask for account context like current karma. TinyFish powers part of the web-agent flow; Bridge connects it back to your desktop. #AIAgen
LLMs are trained on web data. Physical AI is trained on physical data. Physical data is different from web data. It is multi-rate: cameras might run at 10-30Hz, joint angles at 100-200Hz, and IMUs at 1kHz. It is also multimodal: one stre
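A small sketch of what "multi-rate" means in practice, assuming three synthetic sensor streams at rates picked from the ranges in the post (30 Hz camera, 100 Hz joints, 1 kHz IMU). For each camera frame it looks up the most recent sample from the faster streams. The helper names are hypothetical, and this says nothing about how any particular log format stores its data.

```python
from bisect import bisect_right

def make_stream(rate_hz, duration_us=1_000_000):
    """Synthetic stream of (timestamp_us, sample_index) pairs at rate_hz."""
    period_us = 1_000_000 // rate_hz
    return [(i * period_us, i) for i in range(duration_us // period_us)]

def align_to(reference, others):
    """For each reference sample, pick the latest sample from every other stream.

    'Latest' means the newest sample with timestamp <= the reference timestamp,
    so nothing from the future leaks into a row.
    """
    times = {name: [ts for ts, _ in stream] for name, stream in others.items()}
    rows = []
    for t_us, frame in reference:
        row = {"t_us": t_us, "camera": frame}
        for name, stream in others.items():
            i = bisect_right(times[name], t_us) - 1
            row[name] = stream[i][1] if i >= 0 else None
        rows.append(row)
    return rows

camera = make_stream(30)    # 33,333 us between frames
joints = make_stream(100)   # 10,000 us between samples
imu = make_stream(1000)     # 1,000 us between samples

rows = align_to(camera, {"joints": joints, "imu": imu})
print(len(rows))  # 30
print(rows[3])    # {'t_us': 99999, 'camera': 3, 'joints': 9, 'imu': 99}
```

Because the streams tick at unrelated periods, a camera frame rarely lands exactly on a joint or IMU timestamp, which is why a plain row-per-timestep table fits this kind of data poorly.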
OpenClaw's dependency purge continues. Killed Sharp and Jimp. Replaced them with photon, a small WebAssembly module that runs compiled Rust for image processing. 2MB vs 140MB.
Introducing SkillOpt — an optimizer for agent skills. Instead of finetuning model weights, we treat a natural-language skill as a trainable external parameter. Think of it as deep learning for the frontier-model + agent era: learning rat
Today we noticed Chrome unexpectedly opening Gmail and searching through emails related to us, while Codex was shown controlling Chrome from the menu bar. After investigating for a while, we traced the behavior back to the Codex Suggestion
It's been a long journey — 6 years and 381 chips to be exact — but HUAWEI's He Tingbo explains how HUAWEI's high-end chips are now expected to feature a transistor density that is equivalent to 14 Å (1.4 nm) processes by 2031.
The @pnpmjs e2e tests now use a "pnpm registry" instead of verdaccio. In the future we'll make pnpm faster with this registry.
Grok foundation model V9-Medium (1.5T) has finished training. Evals look good. A lot of Cursor data was added in supplementary training and there is more to come. Fine-tuning is underway and reinforcement learning begins in a few days. 2 t
Accepted to TMLR, with reproducibility certification. v2 of our JEPA-WM study (arXiv:2512.24497) is out, with new data-scaling experiments, a Lipschitz analysis of multistep rollout training, and extended discussions. Recap + what's new
Shocking! miHoYo Employee Burns Through 2 Million Yuan Playing with AI in One Night. A colleague set up dozens of agents over a weekend and forgot to shut them down, only to discover that they had consumed 2 million RMB in tokens overnight
Excited to share that Srivatsa Katta, CTO of Rapido, is keynoting Day 1 of #KubeCon + #CloudNativeCon India (18-19 June, Mumbai) Rapido powers 4M+ rides daily across 150+ microservices at 200K req/sec and it runs on the CNCF stack. Srivats
making a database index and querying it N times doesn't take N^2 complexity: it takes N + NlogN, which is what scaled dot product attention should take for a billion token context window. that it's a weighted sum, rather than a lookup, is a
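The complexity claim about indexed lookups can be checked with a short sketch, assuming a sorted-array index with binary search as a stand-in for a database index (a simplification chosen for illustration). It counts comparison steps for N lookups with and without the index.

```python
def build_index(keys):
    """Sorted (key, position) pairs; the sort costs O(N log N) once."""
    return sorted((k, i) for i, k in enumerate(keys))

def indexed_lookup(index, key):
    """Binary search: O(log N) steps per query."""
    lo, hi, steps = 0, len(index), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if index[mid][0] < key:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(index) and index[lo][0] == key
    return (index[lo][1] if found else None), steps

def linear_lookup(keys, key):
    """Full scan: O(N) steps per query."""
    steps = 0
    for i, k in enumerate(keys):
        steps += 1
        if k == key:
            return i, steps
    return None, steps

N = 1024
# 7919 is odd, so multiplying by it modulo 1024 permutes 0..1023.
keys = [(i * 7919) % N for i in range(N)]
index = build_index(keys)

indexed_steps = linear_steps = 0
all_match = True
for position, key in enumerate(keys):
    found_a, a = indexed_lookup(index, key)
    found_b, b = linear_lookup(keys, key)
    all_match = all_match and found_a == found_b == position
    indexed_steps += a
    linear_steps += b

print(linear_steps)           # 524800
print(indexed_steps <= 11 * N)  # True
```

With N = 1024 the scans total N(N+1)/2 = 524,800 steps, while the indexed lookups stay under 11 steps each. The sketch covers exact-match lookup only; as the post itself notes, attention computes a weighted sum over all keys, which is a different operation.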
nice write up from the HuggingFace folks aggregating works on defining agents, harnesses, environments, RL, etc. The more we can roughly have a shared vocabulary the better…I still find it confusing (lolll), but we’re roughly converging on
deepseek just permanently priced their frontier model at 1/30th of american labs. anyone know what the hardware story here is? is this huawei chips driving lower costs or model optimizations or lower margins?
introducing pierrejo for all the brotherin struggling to migrate off of github an open-source fully-functional pierre diff integration for PRs in forgejo consumable via nix flake instantaneous diff loading using pierre SSR enjoy
HyperParallel-MoE is an Ascend-specific scheduling system for MoE training. Ascend A3 exposes separate AIC matrix units and AIV vector/communication units, but standard MoE execution still runs Dispatch, GMM, SwiGLU, and Combine as seriali
DR Tulu is now accepted for an oral presentation at #ICML2026. Updated paper: https://arxiv.org/abs/2511.19399 We added more ablations including using Qwen3-8B as the rubric generator & judge, showing evolving rubrics work with a weak mod
in traces v0.6.0 you can: search your local sessions across agents; launch any of the sessions you see directly in the agent cli (currently mac only); view sessions from other people in the orgs you're in, directly from the terminal. I
Updated my MLX Vulkan CI to record and show benchmark results on every commit to main!
This is really really bad, found another severe vulnerability in CBSE's OSM portal. Just sent another report to CERT-In.
So i've been thinking about an LVR-aware analyst AI agent to trade in some base AMM pool. There is a fairly new paper i would like to try and implement. The most common pattern for on-chain trading agents is LLM + a price feed + a swap ro
Lots of Blackwell specific PTX techniques to find here for the interested :)
Finally, advisories for the vulns that I found in dwmcore.dll were fixed (CVE-2026-34336, CVE-2026-35419). However, for CVE-2026-34336 the list of CWEs is not accurate, because a heap-based overflow is possible due to integer overflow and integer overflow
"use workflow" handles this http:// workflow-sdk.dev
Reproduced HRM-Text XL (1B). Training completed in ~38 hours wall-clock on 16 H200 GPUs, and evaluation performance matches the numbers reported in the paper. Great job, team! W&B report: https://api.wandb.ai/links/MDEQGA/70ciyctr …
Awesome job @jesse_merhi sharing, writing and cranking... while helping secure us all > How I Accidentally Ended Up Helping Secure OpenClaw https:// jmerhi.mov/blog/dangerous -crustacean/ … and https:// openclaw.ai/blog/where-ope n
this weekend's obsession, the amazing `cmux` by @lawrencecchen It's the perfect agentic engineering Integrated Development Environment! My set up: - Each Workspace in the left navbar is a project folder - Each workspace is split into at
I am working on Online RL. MolmoACT2 deploys out of the box but fails at many tasks due to its data distribution. One really interesting insight would be to have a setup working where you can deploy this good enough base policy and make it
introducing pipi, the shitty robot. brain lives on my laptop, sensors/UI live on the mounted phone. time to completion: 24h (minus sleep, knight festival, lunch, dinner, and play) built with http://pi.dev
1/ Weak LLMs generate correct solutions in their latent space all the time—they just fail to select them. A new paper proves that wrapping a nano-sized model in a structured critic-comparator harness matches frontier giants on SWE-bench.
The architecture under it is genius. Works with every cli, OS, one or multiple "agents" seamlessly, which was also the point and how I like to build things. The more I'm trying the new version (released 30m ago) the more fun it is. I unified
why do LLMs generate long duplicate strings of text so slowly? consider an LLM outputting 100 x's: if you ask it to do that 10 times, it takes a linear amount of time to generate, when instead the LLM could just reference that symbol 10 times
Really amazing results analyzing what's creative/novel vs. what's copied from Internet data, enabled by the amazing @liujc1998 's Infini-gram! http://infini-gram.io This is also enabled in @allen_ai 's OlmoTrace http://allenai.org/b
GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction Reconstructing high-fidelity 3D scenes from sparse RGB input is hard. It needs a strong 3D prior! We reformulate multi-view scene reconstruction as conditional 3D
EVE-Agent argues self-evolving search agents should not train on examples they cannot justify. Data-free self-evolving search agents generate their own questions, answer them, and improve from their own feedback. That scales beautifully wit
Agents work well with static types, schemas, state machines, etc. Keep deterministic what should be deterministic. Stop trying to make everything non-deterministic.
This incident is unrelated to Squid’s core protocol and contracts. All Squid users and integrators are unaffected and no action is needed. A third-party Gnosis Safe module was exploited today across Base and Ethereum, resulting in approxim
Forged parts beat cast parts on strength for one reason: grain flow. Casting solidifies from liquid → grain is random. Forging deforms solid metal → grain follows the shape. In a crankshaft, the grain runs exactly where the stress is. T
New project: a coding and formal verification agent for computational physics and applied mathematics. Auto-generate type-correct DSL code for equations and numerical schemes, autoformalize correctness properties in Lean/Isabelle/Rocq, then
This graph from the NLA paper, imo, provides pretty convincing evidence that activation verbalizers surface unverbalized eval awareness. It is also crazy. Notice how the verbalized eval awareness dot is offset only when it's significantly
Building some extremely cool stuff on top of Cloudflare Dynamic Workers + Sandboxes + Artifacts. Release with blog post soon.
We just released code and model! Go check it out! Code: https://github.com/nv-dvl/vgg-ttt Model: https://huggingface.co/nvidia/vgg-ttt
audio reactivity tests with ASCII fluid sim. going to take a lot of work to tune emitters and parameters so it actually looks good.
Agora is about an order of magnitude faster than the system that powered Node-0. 175k tok/s is fast.
Today, I'm working on trying to better understand how coding assistants mention HF's products. Taking a simple approach of running tons of queries and analyzing the answers with @DAKlingbeil 's https://submarine.ai (ex: https://huggi
Crazy how quickly my workflow transitioned from local + cursor + claude code to exe dev + tmux + zed + codex. Devtools have no lock-in