Backlist — 02 Jun 2026 UTC

1.

Trivy scanner credentials compromised in npm supply-chain worm

A scanner trusted in CI/CD became a worm delivery path across 60+ packages within 24 hours, showing how defensive tooling can become a privileged supply-chain target

by @DFIR_Radar (DFIR Radar) · backlist 2026-06-02 · rubric 92.0

2.

Turbopuffer adds constant-time copy-on-write branching (t.co)

Turbopuffer added instant copy-on-write namespace clones with 440ms p50 branch creation, making database-scale branching feel like a filesystem primitive

by @turbopuffer · backlist 2026-06-02 · rubric 79.0
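
To make the copy-on-write idea in the item above concrete, here is a minimal sketch of namespace branching built from layered dictionaries. It is not Turbopuffer's implementation: the `Namespace` class and its `branch`, `get`, and `put` methods are hypothetical names chosen for illustration, deletions are not handled, and branch cost here depends only on the number of layers, never on how much data the namespace holds.

```python
class Namespace:
    """Toy copy-on-write namespace (hypothetical, for illustration only)."""

    def __init__(self, layers=None):
        # Older layers are shared between branches and never mutated again.
        self._frozen = layers or ()
        # Writes made after the last branch point land here.
        self._local = {}

    def branch(self):
        # Freeze the current writes into a shared layer instead of copying data.
        shared = self._frozen + (self._local,)
        self._frozen = shared
        self._local = {}
        return Namespace(shared)

    def put(self, key, value):
        self._local[key] = value

    def get(self, key, default=None):
        if key in self._local:
            return self._local[key]
        # Search shared layers from newest to oldest.
        for layer in reversed(self._frozen):
            if key in layer:
                return layer[key]
        return default


base = Namespace()
base.put("doc1", "v1")
fork = base.branch()
fork.put("doc1", "v2")   # visible only in the fork
base.put("doc2", "x")    # written after the branch, so the fork never sees it
```

Both sides keep reading the shared layer until they overwrite a key, which is why creating the branch does not require copying existing documents.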

3.

HBM4E die size points to a memory supply bottleneck through 2030

A 32Gb HBM4E die appears to be around 145–150mm², implying each memory generation consumes more wafer capacity and keeps supply tight through the decade

by @racetrack275 (finánce) · backlist 2026-06-02 · rubric 74.0

4.

Mercor says it is at $1.5B run-rate revenue (x.com)

A $1.5B run rate at 35–40% gross margins reframes Mercor from a staffing marketplace into one of the fastest-growing AI data businesses

by @tanayj (Tanay Jaipuria) · backlist 2026-06-02 · rubric 22.0

5.

Critical remotely triggerable Linux kernel H.323 parser bug

A remotely triggerable H.323 connection-tracking parser bug in the Linux kernel can read out of bounds from unauthenticated packets, keeping legacy protocol parsers on the front line

by @GuanniQu (Jenny Qu) · backlist 2026-06-02 · rubric 92.0

6.

Chaos testing found and fixed a real OpenRaft bug (t.co)

A chaos test with network delay made a node panic during snapshot catch-up and stall two Raft groups, with an upstream fix shipped the same day

by @tonboio (tonbo) · backlist 2026-06-02 · rubric 88.0

7.

N-of-1 AAV gene therapy stops WWOX epilepsy seizures in an infant

An eight-month-old with WWOX epilepsy became seizure-free after an intrathecal AAV delivered a working gene copy, extending n-of-1 medicine beyond ASOs

by @alexkesin (Alex Kesin) · backlist 2026-06-02 · rubric 12.0

8.

Citadel plans to buy trading insights from other hedge funds

Citadel’s program would formalize a market for external hedge-fund alpha that feeds back into its own quantitative strategies

by @nishantkumar07 (Nishant Kumar) · backlist 2026-06-02 · rubric 55.0

9.

A scaling law for context management in deep research agents

A 32K-trajectory study across model sizes, retrievers, and benchmarks claims context compression decisions in deep research agents can be predicted rather than tuned by heuristics

by @zhuofengli96475 (Zhuofeng Li) · backlist 2026-06-02 · rubric 89.0

10.

Robotics vision encoder trained for dynamics improves OOD success

Training vision encoders to see dynamics instead of static frames improves real-world robotics out-of-distribution success by 22.5%

by @jbhuang0604 (Jia-Bin Huang) · backlist 2026-06-02 · rubric 46.0

11.

SRAM scaling is hitting Cerebras’ wafer-scale roadmap

Cerebras’ wafer SRAM jumped 2.2x from WSE-1 to WSE-2 but only 10% on WSE-3, illustrating how SRAM scaling is becoming a hard limit for wafer-scale AI chips

by @SemiAnalysis_ (SemiAnalysis) · backlist 2026-06-02 · rubric 72.0

12.

Designing a PD-1 binder with ESM and Modal for about $1,000 (x.com)

A notebook workflow can design PD-1 minibinders and scale a 96-well plate for roughly $1,000 with success rates above 50% on typical targets

by @ebetica (Zeming Lin) · backlist 2026-06-02 · rubric 22.0

13.

Automatic transmissions are hydraulic computers

An automatic transmission valve body selects gears from oil-pressure signals without electronics, running a mechanical control system on fluid logic

by @JacklouisP (Jack) · backlist 2026-06-02 · rubric 12.0

14.

Can LLMs predict GPU kernel runtimes?

A 12K-kernel benchmark suggests LLMs can act as selective runtime surrogates during kernel search and defer to real GPUs when uncertain

by @jmin__cho (Jaemin Cho) · backlist 2026-06-02 · rubric 84.0

15.

Wuji MJLab: open-source MuJoCo environment for dexterous hands

An open MuJoCo dexterous-hand environment with cube reorientation, sim-to-real tooling, and reproducible setup lowers the barrier to hand-manipulation research

by @guanqi_he (Guanqi He) · backlist 2026-06-02 · rubric 79.0

16.

Why software rewrites fail

Implementations encode years of edge cases, operational lessons, failed experiments, and invariants, which is why clean rewrites often discard the real system they are trying to replace

by @chadfowler (chad fowler) · backlist 2026-06-02 · rubric 30.0

17.

EU digital identity wallets require Apple or Google accounts

Europe’s sovereign digital identity wallet depending on Apple or Google accounts exposes a gap between digital sovereignty goals and mobile platform reality

by @IntCyberDigest (International Cyber Digest) · backlist 2026-06-02 · rubric 28.0

18.

GoDaddy malware campaign hid C2 instructions in Steam comments

A malware campaign hid command-and-control instructions in Steam profile comments using invisible Unicode, abusing trusted gaming infrastructure for WordPress infections

by @DFIR_Radar (DFIR Radar) · backlist 2026-06-02 · rubric 88.0

19.

Wall Attention: RoPE-free attention with length generalization

Diagonal forget gates offer a RoPE-free attention mechanism that beats RoPE and Forgetting Attention on length generalization after 4K-sequence training

by @tilderesearch (Tilde) · backlist 2026-06-02 · rubric 57.0

20.

Most tokens spend 70% of their lives below launch price

Across 540+ token launches since 2020, the average token spent 70% of its life below launch price, quantifying how low-float high-FDV launches transfer value away from later buyers

by @Delphi_Digital (Delphi Digital) · backlist 2026-06-02 · rubric 24.0

21.

NeurIPS position paper track cracks down on AI-generated submissions (t.co)

The NeurIPS position paper track found heavy AI use despite a human-written requirement, creating a test case for academic review norms under ubiquitous writing tools

by @NeurIPSConf (NeurIPS Conference) · backlist 2026-06-02 · rubric 5.0

22.

Solar is cheap; deployment is now the bottleneck

Solar panels are no longer the main cost bottleneck; deployment, storage, transmission, grid flexibility, permitting, and AI-driven demand now dominate the problem

by @schrep (Mike Schroepfer) · backlist 2026-06-02 · rubric 72.0

23.

Three Mile Island restart waiver tied to Microsoft data centers

A waiver to accelerate Three Mile Island’s restart for Microsoft data centers ties AI infrastructure growth directly to nuclear regulatory decisions

by @Polymarket · backlist 2026-06-02 · rubric 24.0

24.

Cross-node KV-cache reuse for agentic rollouts

Cross-node prefix-cache reuse in vLLM via Mooncake Store makes agent rollouts cheaper by sharing reused context across distributed training nodes

by @m_sirovatka (Matej Sirovatka) · backlist 2026-06-02 · rubric 73.0

25.

CSS field-sizing: content for auto-growing textareas (x.com)

The field-sizing: content property lets textareas grow naturally without JavaScript autosize hacks

by @AliGrids (Ali Grids) · backlist 2026-06-02 · rubric 34.0

26.

How neuronal proteins drain from the brain (x.com)

Tracking neuronal proteins from creation to clearance shows waste drains to nearby immune niches and that Alzheimer’s disease blocks the route

by @andcyang (Andrew C. Yang) · backlist 2026-06-02 · rubric 12.0

27.

0xPPL shuts down after the long tail of crypto apps failed to arrive

0xPPL shut down after betting on a long tail of crypto apps that consolidated instead into a few super-app categories

by @myprasanna (Prasanna S) · backlist 2026-06-02 · rubric 6.0

28.

China is sinking a submarine data center off Hainan

China’s 1,300-ton undersea data center off Hainan aims to cut cooling costs by using seawater, raising new questions about data-center siting and physical risk

by @5chmatme (猫報道) · backlist 2026-06-02 · rubric 31.0

29.

JEE Advanced 2026 candidate/result infrastructure exposed bulk candidate data (t.co)

JEE Advanced 2026 candidate/result infrastructure (https://cdata.jeeadv.ac.in/result2026/) had a public cloud storage misconfiguration exposing bulk candidate data without auth. This exposed ~179.6k result records and ~187.3k admit-card

by @DarthKermi72747 (Rylen Anil) · backlist 2026-06-02 · rubric 96.0

30.

In 2016 I worked on a 1 million container challenge while working on Nomad at Hashicorp. We spun up a million con…

In 2016 I worked on a 1 million container challenge while working on Nomad at Hashicorp. We spun up a million containers in 5-7 minutes and kept them running for ~an hour to show our users Nomad can run large clusters at steady state. 10

by @diptanu (Diptanu Choudhury) · backlist 2026-06-02 · rubric 88.0

31.

We found a debug flag enabled in 6 Microsoft Android apps that turned into a vulnerability

We found a debug flag enabled in 6 Microsoft Android apps that turned into a vulnerability. Any app on the device could access the Microsoft account. Affecting: Word, OneNote, PowerPoint, Excel, 365 Copilot, Loop. Here's the full story of

by @Yanir_ (Yanir Tsarimi) · backlist 2026-06-02 · rubric 86.0

32.

The writeup is here. We achieved RCE in Minecraft Bedrock, turning a 4-byte heap overflow into complete client co… (x.com)

The writeup is here. We achieved RCE in Minecraft Bedrock, turning a 4-byte heap overflow into complete client compromise. @ryaagard details a universal, Bedrock-specific technique for bypassing ASLR and achieving arbitrary read / write

by @osec_io (OtterSec) · backlist 2026-06-02 · rubric 86.0

33.

Finally, we can reverse proxy Cursor's own Composer 2.5 to make it available for any agent to use. (t.co)

Finally, we can reverse proxy Cursor's own Composer 2.5 to make it available for any agent to use. https://api-for-cursor.standardagents.ai

by @dingyi (Ding) · backlist 2026-06-02 · rubric 86.0

34.

Can an LLM act as a selective model of a GPU during evolutionary search, by reasoning + forecasting a kernel’s ru…

Can an LLM act as a selective model of a GPU during evolutionary search, by reasoning + forecasting a kernel’s runtime but deferring to a GPU when unsure? We produced 12k kernels + runtimes from evolutionary search, costing 400M reasoning t

by @codezakh (Zaid Khan) · backlist 2026-06-02 · rubric 84.0

35.

Some of the more puzzling unpublished observations from our paper: deep attention layers hate the residual stream…

Some of the more puzzling unpublished observations from our paper: deep attention layers hate the residual stream of V and love it for QK, but if it has to make a choice, it will satisfy V over QK. Translated to finding: if we learn coeffi

by @HeMuyu0327 (Muyu He) · backlist 2026-06-02 · rubric 84.0

36.

native copy-on-write branching in tpuf is an enormous ship. we already have 10M+ branches in production.

native copy-on-write branching in tpuf is an enormous ship. we already have 10M+ branches in production. you can go try it right now

by @Sirupsen (Simon Eskildsen) · backlist 2026-06-02 · rubric 83.0

37.

Grok Voice from xAI controlling my macOS desktop!

Grok Voice from xAI controlling my macOS desktop! Built a custom app with 400+ tools. Handles music, browser, settings, lightning fast responses. It's able to spin up Grok Build agents and manage them. Grinding on this, still testing.

by @Daniel_Farinax (Dan) · backlist 2026-06-02 · rubric 82.0

38.

So /goal is awesome (x.com)

So /goal is awesome Over the past few weeks I used @PrimeIntellect to train a 149M late interaction model based on GTE-ModernColBERT-v1 using PyLate, focused on clause extraction from legal contracts. On the MLEB benchmark it does well

by @kmad (Kevin Madura) · backlist 2026-06-02 · rubric 82.0

39.

SDPO++ for Continual Learning

SDPO++ for Continual Learning Day 5 of Trajectory, we modify Self Distillation Policy Optimization for long horizon agentic tasks. SDPO is a promising route. It learns from a single trajectory, with no group required and failures still p

by @rronak_ (Ronak Malde) · backlist 2026-06-02 · rubric 79.0

40.

GRPO has a known dead-zone: when all sampled trajectories are all correct or all wrong, group-relative advantage …

GRPO has a known dead-zone: when all sampled trajectories are all correct or all wrong, group-relative advantage collapses and learning stalls. On-Policy Self-Distillation tried to give dense token-level guidance but its token preferences

by @sheriyuo (Xiuyu Li) · backlist 2026-06-02 · rubric 79.0
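
The dead-zone described in the item above follows directly from how a group-relative advantage is usually computed. The sketch below assumes the common formulation (reward minus group mean, divided by group standard deviation plus a small epsilon); the function name and the `eps` value are illustrative choices, not taken from the post.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (assumed formulation)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A mixed group yields a usable signal: positive for successes, negative for failures.
mixed = group_relative_advantages([1.0, 0.0, 1.0, 0.0])

# Uniform groups give every trajectory an advantage of exactly zero,
# so the policy gradient contribution from that prompt vanishes.
all_correct = group_relative_advantages([1.0, 1.0, 1.0, 1.0])
all_wrong = group_relative_advantages([0.0, 0.0, 0.0, 0.0])
```

When every sampled trajectory earns the same reward, the numerator is zero for all of them, which is the stall the post refers to.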

41.

Cerebras did what the industry calls impossible: turned an entire 46,225mm² wafer into one chip. Defects on silic…

Cerebras did what the industry calls impossible: turned an entire 46,225mm² wafer into one chip. Defects on silicon that big are inevitable, so they built in redundancy and custom per-batch masks that route around every bad core, landing ne

by @SemiAnalysis_ (SemiAnalysis) · backlist 2026-06-02 · rubric 78.0

42.

user reported a bug that was happening for them but not for me

user reported a bug that was happening for them but not for me. i whipped up a prompt for them to run in codex. a few mins later: fix confirmed, user opened PR. self-healing software is here, just not evenly distributed

by @KishanBagaria (Kishan Bagaria) · backlist 2026-06-02 · rubric 78.0

43.

Verification is the hidden bottleneck for knowledge work agents, especially in legal AI — complex, long-horizon w… (x.com)

Verification is the hidden bottleneck for knowledge work agents, especially in legal AI — complex, long-horizon work is graded by rubrics with dozens of strict criteria. In new research with @langchain Labs, we study how to verify legal

by @nikogrupen (Niko) · backlist 2026-06-02 · rubric 78.0

44.

Today i'm excited to announce storagesdk, built in partnership with (x.com)

Today i'm excited to announce storagesdk, built in partnership with @TigrisData. A TypeScript SDK for object storage with snapshots and forks as first-class primitives. branch a bucket per agent run. mutate safely. replay from the same ba

by @computesdk (ComputeSDK) · backlist 2026-06-02 · rubric 78.0

45.

We now have upwards of 500,000 CPUs across our servers.

We now have upwards of 500,000 CPUs across our servers. I went to order a block of 100k more. Only one vendor was able to fulfill that order; no other supplier had the scale required for our timeframe. The CPU crisis isn't here yet for mo

by @ivanburazin (Ivan Burazin) · backlist 2026-06-02 · rubric 78.0

46.

One policy. Many robots. Many dexterous hands.

One policy. Many robots. Many dexterous hands. XL-VLA learns a shared cross-hand latent space for VLA models across XHand, Inspire, Ability, and Paxini. It retargets trajectories across hands, scales with humanoid and tabletop robot data,

by @YutongLiang_ (Yutong Liang) · backlist 2026-06-02 · rubric 78.0

47.

Attackers hijacked Red Hat's legitimate npm scope to push backdoored versions of 32 packages targeting cloud secr…

Attackers hijacked Red Hat's legitimate npm scope to push backdoored versions of 32 packages targeting cloud secrets and CI/CD tokens. The malware spread via compromised GitHub Actions OIDC tokens, affecting 9.8M downloads. #DFIR_Radar

by @DFIR_Radar (DFIR Radar) · backlist 2026-06-02 · rubric 78.0

48.

Miasma malware compromises Red Hat npm packages in sophisticated supply chain attack, stealing credentials and sp…

Miasma malware compromises Red Hat npm packages in sophisticated supply chain attack, stealing credentials and spreading through CI/CD pipelines using worm-like behavior. Builds on Shai-Hulud tactics with GitHub abuse for verified malicious

by @DFIR_Radar (DFIR Radar) · backlist 2026-06-02 · rubric 78.0

49.

Most "agent memory" benchmarks just test whether a chatbot remembers your preferences. That tells you almost noth…

Most "agent memory" benchmarks just test whether a chatbot remembers your preferences. That tells you almost nothing about real agents. So we built MemGym: memory evaluation for deep research, coding, and GUI agents, with a clean memory-iso

by @wujiang_ai (Wujiang Xu) · backlist 2026-06-02 · rubric 78.0

50.

I have successfully evaluated OPUS on models exceeding 1 trillion parameters with strong results. However, due to…

I have successfully evaluated OPUS on models exceeding 1 trillion parameters with strong results. However, due to some reasons, we are unable to include these findings in the current manuscript. Additionally, I have observed that OPUS demo

by @ShaoboWang6 (Shaobo Wang) · backlist 2026-06-02 · rubric 78.0

51.

Excited to be presenting our work on memory + VLAs at ICRA'26 this Thursday morning (poster 224).

Excited to be presenting our work on memory + VLAs at ICRA'26 this Thursday morning (poster 224). We found that a super simple language-based scratchpad with spatial and temporal grounding goes a long way in imparting memory to VLAs. 1/n

by @SanjayHaresh (Sanjay Haresh) · backlist 2026-06-02 · rubric 76.0

52.

currently at happy hour and learning the rodent company just straight up took some of our code, violated the lice…

currently at happy hour and learning the rodent company just straight up took some of our code, violated the license, still violating the license how did we know? they left links to our issue tracker in it 1-10 how aggro should i be? how

by @zeeg (David Cramer) · backlist 2026-06-02 · rubric 74.0

53.

In the past months, I have advocated where I could to stop investing in obfuscation as a protection mechanism.

In the past months, I have advocated where I could to stop investing in obfuscation as a protection mechanism. It was good at frustrating humans but machines just don't care that much. Packing is more effective so far but I don't expect thi

by @JusticeRage (Ivan Kwiatkowski) · backlist 2026-06-02 · rubric 74.0

54.

As of today, there’s essentially 5 companies who have successfully completed meaningful SOTA moving decentralized… (x.com)

As of today, there’s essentially 5 companies who have successfully completed meaningful SOTA moving decentralized *pretraining* runs: - @PrimeIntellect (10B INTELLECT-1, Oct 24) - @Pluralis (7.5B Node0, Oct 25; 8B Agora launch, May 26)

by @jbrukh (Jake Brukhman) · backlist 2026-06-02 · rubric 74.0

55.

well, one way to be fast is to reduce the active count. another is to address the attention architecture. the spa…

well, one way to be fast is to reduce the active count. another is to address the attention architecture. the sparsity graph doesn’t quite address this angle with param count x axis

by @fujikanaeda (Eric W. Tramel) · backlist 2026-06-02 · rubric 74.0

56.

best writeup on the topic. most harnesses out there are built with no abstractions in mind, software engineering …

best writeup on the topic. most harnesses out there are built with no abstractions in mind, software engineering replaced by giant brittle prompts and the occasional summarization. you cannot test them, you can barely reason about them and

by @dosco (spacy) · backlist 2026-06-02 · rubric 74.0

57.

MiMo-V2.5-Pro now available on Zyphra Cloud! (x.com)

MiMo-V2.5-Pro now available on Zyphra Cloud! Huge context, super fast, optimized and served on @AMD MI355X. Full context at $1/M input, $3/M output, $0.2/M cached. Try now at http://cloud.zyphra.com

by @ZyphraAI (Zyphra) · backlist 2026-06-02 · rubric 72.0

58.

Modded-NanoGPT optimization result #29 (2026/05/11): (x.com)

Modded-NanoGPT optimization result #29 (2026/05/11): @nilinabra has achieved a new step-count record of 2990 (40-step improvement) by halving the growth rate of the L2-norm of the hidden matrix parameters. This result is better than the

by @kellerjordan0 (Keller Jordan) · backlist 2026-06-02 · rubric 72.0

59.

ProgramAsWeights can now trade compile time for accuracy. (t.co)

ProgramAsWeights can now trade compile time for accuracy. New Finetune compiler synthesizes 3.6K examples & finetunes 100 steps in 1min — 80% acc on tasks where old PAW got 0%! I used it to build an "Ask me anything" website helper: htt

by @yuntiandeng (Yuntian Deng) · backlist 2026-06-02 · rubric 72.0

60.

Marvell CEO says copper wall is moving inside the rack, and copackaged optics is the only way through

Marvell CEO says copper wall is moving inside the rack, and copackaged optics is the only way through • Marvell CEO Matt Murphy emphasized at Computex 2026 that the next bottleneck in AI infrastructure is not compute or memory but connect

by @jukan05 (Jukan @COMPUTEX) · backlist 2026-06-02 · rubric 72.0

61.

CVE-2026-8732 (CVSS 9.8) in WP Maps Pro plugin lets unauthenticated attackers create WordPress admin accounts via…

CVE-2026-8732 (CVSS 9.8) in WP Maps Pro plugin lets unauthenticated attackers create WordPress admin accounts via flawed "temp access" feature. 2,858 attacks blocked in 24 hours across 15,000+ vulnerable sites. Update to v6.1. #DFIR_Radar

by @DFIR_Radar (DFIR Radar) · backlist 2026-06-02 · rubric 72.0

62.

I was just scammed for $500K by Polymarket.

I was just scammed for $500K by Polymarket. I am "willo2", the top holder of YES on "MicroStrategy sells Bitcoin by May 31st". Here's what happened:

by @willo2_Poly (willo2) · backlist 2026-06-02 · rubric 72.0

63.

interesting... the numbers on my a100 gemms are about 2-6% higher consistently with the new mma_throughput pragma…

interesting... the numbers on my a100 gemms are about 2-6% higher consistently with the new mma_throughput pragma vs when not

by @aryanvs_ (Aryan V S) · backlist 2026-06-02 · rubric 72.0

64.

Serving LLMs is expensive because decoding is bound by memory bandwidth, not raw compute.

Serving LLMs is expensive because decoding is bound by memory bandwidth, not raw compute. KV caching solves this by storing each token's K/V tensors once and reusing them at every step, so you skip the quadratic recompute. Pairing it with p

by @paddix (Paddy Srinivasan) · backlist 2026-06-02 · rubric 71.0
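
As a rough illustration of the recompute that KV caching avoids, the toy single-head decoder below counts key/value projections with and without a cache. Everything here is an assumption made for the sketch: `project` stands in for learned weight matrices, the raw token vector doubles as the query, and real inference engines do none of this in pure Python.

```python
import math


def project(x, w):
    """Toy stand-in for a learned K or V projection: scale each component by w."""
    return [w * xi for xi in x]


def attend(q, keys, values):
    """Softmax attention of one query over all keys and values seen so far."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q)) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))]


def decode_without_cache(tokens):
    outputs, projections = [], 0
    for t in range(1, len(tokens) + 1):
        # Re-project every earlier token at every step: quadratic in sequence length.
        keys = [project(x, 0.5) for x in tokens[:t]]
        values = [project(x, 2.0) for x in tokens[:t]]
        projections += 2 * t
        outputs.append(attend(tokens[t - 1], keys, values))
    return outputs, projections


def decode_with_cache(tokens):
    outputs, projections = [], 0
    keys, values = [], []  # the KV cache
    for x in tokens:
        # Project only the new token and append it to the cache.
        keys.append(project(x, 0.5))
        values.append(project(x, 2.0))
        projections += 2
        outputs.append(attend(x, keys, values))
    return outputs, projections


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
slow_out, slow_cost = decode_without_cache(tokens)
fast_out, fast_cost = decode_with_cache(tokens)
```

For four tokens the uncached path performs 20 projections against 8 with the cache, while producing the same outputs. The cached path still reads every stored key and value at each step, which is consistent with the post's point that decoding is bound by memory bandwidth.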

65.

Just imagine:

Just imagine: you create a model (MinMax M.27) that scores the SAME results as Opus 4.6 on SWE Bench PRO. But when we create a benchmark where your model didn't train, you literally score 0. Because MinMax models are shit, and incomparabl

by @melvynx (Melvyn • Builder) · backlist 2026-06-02 · rubric 71.0

66.

RL these days generally requires three things: inference, training, and sandboxes. I wonder what AI infrastructur…

RL these days generally requires three things: inference, training, and sandboxes. I wonder what AI infrastructure provider has all three?

by @bernhardsson (Erik Bernhardsson) · backlist 2026-06-02 · rubric 71.0

67.

XBOW is harnessing the power of AI to transform offensive security. Curious how autonomous offensive security is …

XBOW is harnessing the power of AI to transform offensive security. Curious how autonomous offensive security is changing the game? It’s day two of the Gartner Security & Risk Management Summit, and we’re ready to talk all things autonomou

by @Xbow (XBOW) · backlist 2026-06-02 · rubric 70.0

68.

Visual encoding is one of those bottlenecks that gets ignored until you’re running image-heavy workloads at scale…

Visual encoding is one of those bottlenecks that gets ignored until you’re running image-heavy workloads at scale. Offloading to CPU at near-zero added cost and getting 1.3-30x lower TPOT is not obvious — most people assume CPU offloading h

by @richardczl (Richard Chen) · backlist 2026-06-02 · rubric 70.0

69.

Despite rapid progress in AI agent research, Korean agentic benchmarks remain largely absent! (t.co)

Despite rapid progress in AI agent research, Korean agentic benchmarks remain largely absent! To narrow this gap, we release K-BrowseComp, a benchmark that requires searching across Korean websites and Korean-language content. https:// a

by @seungonekim (Seungone Kim) · backlist 2026-06-02 · rubric 70.0

70.

q: "why don't Sora-like models learn compositional physics understanding or do ICL like how language models learn…

q: "why don't Sora-like models learn compositional physics understanding or do ICL like how language models learn compositional semantics?" a: every attempt to date heavily leaks information from the future. some even bake it into the bottl

by @kalomaze · backlist 2026-06-02 · rubric 69.0

71.

Custom KQL rule in Microsoft Sentinel successfully caught 260 SSH brute force attempts across 3 attack waves in 2…

Custom KQL rule in Microsoft Sentinel successfully caught 260 SSH brute force attempts across 3 attack waves in 28 minutes. Rate-based detection prevented alert fatigue while maintaining 100% detection accuracy. Technical breakdown: • Rule

by @DFIR_Radar (DFIR Radar) · backlist 2026-06-02 · rubric 69.0

72.

On coding, we see a big improvement on Vibe Code Bench, where the model scores 47.6%, a massive 35 point increase…

On coding, we see a big improvement on Vibe Code Bench, where the model scores 47.6%, a massive 35 point increase from the previous version and outperforming Gemini 3.1 pro, a frontier model. On SWE-Bench (75%) and Terminal-Bench-2.1 (54%)

by @ValsAI (Vals AI) · backlist 2026-06-02 · rubric 68.0

73.

Running a causal_policy mode experiment with Cosmos 3 with Fast-WAM style predictions, jointly training:

Running a *causal_policy* mode experiment with Cosmos 3 with Fast-WAM style predictions, jointly training: - FD (action -> video) - ID (video -> action) - policy (past_video -> action+video) - *causal_policy* (past_video -> action)

by @neurosp1ke (Andreas Köpf) · backlist 2026-06-02 · rubric 68.0

74.

Congrats to (x.com)

Congrats to @JetBrains on Mellum2-12B-A2.5B-Thinking, an open-source 12B MoE that activates just 2.5B params, handling both natural language and code with a 128K context. Mellum2 runs natively in vLLM from day 0, with reasoning parser a

by @vllm_project (vLLM) · backlist 2026-06-02 · rubric 68.0

75.

nemotron 3 is significantly less sparse than other models (~10% active vs ~3% for kimi K2/deepseek v4)

by @eliebakouch (elie) · backlist 2026-06-02 · rubric 68.0

76.

so (x.com)

so @morgallant has optimized FTS tokenization throughput to 423 MiB/s and open-sourced it ( https:// github.com/turbopuffer/al yze …) I keep telling him that it would be really high agency to get to DRAM bandwidth (~100 GiB/s), and he ke

by @Sirupsen (Simon Eskildsen) · backlist 2026-06-02 · rubric 68.0

77.

Biggest difference between Codex vs Claude Code: Codex won't stop working towards a /goal. CC quits on hard tasks…

Biggest difference between Codex vs Claude Code: Codex won't stop working towards a /goal. CC quits on hard tasks and needs encouragement to keep going. So weird. BUT Codex is pretty bad at intermediate summaries. Its outputs read like eso

by @DimitrisPapail (Dimitris Papailiopoulos) · backlist 2026-06-02 · rubric 67.0

78.

hint variations are quite good at dropping KL shock while keeping the rough tokens touched consistent. (x.com)

hint variations are quite good at dropping KL shock while keeping the rough tokens touched consistent. OPSD (custom variant): high exposure hint OPSD (custom variant): lower exposure hint OPD (yellow is negative KL, purple is positive).

by @ar0cket1 · backlist 2026-06-02 · rubric 67.0

79.

It is often understated how hard it is to compare multilingual models, especially small ones.

It is often understated how hard it is to compare multilingual models, especially small ones. With hundreds of people working on it, you can now enjoy:

by @LChoshen (Leshem (Legend) Choshen) · backlist 2026-06-02 · rubric 66.0

80.

Audio Dataset Cleaning: All that glisters is not Gold

Audio Dataset Cleaning: All that glisters is not Gold -- Said "high quality" audio datasets are not always high quality, which can leave you puzzled when training doesn't make your transcription or TTS model any better. The most robust app

by @TrelisResearch (Trelis Research) · backlist 2026-06-02 · rubric 66.0

81.

Today, we're introducing LightOn Console.

Today, we're introducing LightOn Console. Three endpoints: /Parse any documents /Extract structured data /Search enterprise knowledge with citations Built-in connectors. MCP-ready. Governance enforced at the chunk level. No infrastruct

by @LightOnIO (LightOn) · backlist 2026-06-02 · rubric 66.0

82.

every couple weeks I bump up the disk GB on Replicas agents and go “yup, that’s more than enough”

every couple weeks I bump up the disk GB on Replicas agents and go “yup, that’s more than enough” then some customer pings me asking for 2x more? so I gotta ask our sandboxing provider for more limits, and their poor engineer has to explai

by @connortbot (Connor Loi) · backlist 2026-06-02 · rubric 66.0

83.

Hackers exploited Meta’s AI support bot to reset Instagram passwords, briefly defacing high-profile accounts, whi…

Hackers exploited Meta’s AI support bot to reset Instagram passwords, briefly defacing high-profile accounts, while Meta pushed an emergency patch and experts warned about AI-assisted account recovery risks.

by @Cyber_O51NT (Cyber_OSINT) · backlist 2026-06-02 · rubric 66.0

84.

Cloudflare Sandboxes Cloudflare Tunnel

Cloudflare Sandboxes Cloudflare Tunnel You can now expose a service running inside a Cloudflare Sandbox using a Cloudflare Tunnel Both quick tunnels and named tunnels are supported

by @_ashleypeacock (Ashley Peacock) · backlist 2026-06-02 · rubric 64.0

85.

the agent at the top of the long mem eval benchmark leaderboard is built on langgraph!

the agent at the top of the long mem eval benchmark leaderboard is built on langgraph! if you're thinking about building a high performance memory system for your agent, you should read this

by @sydneyrunkle (Sydney Runkle) · backlist 2026-06-02 · rubric 64.0

86.

BlueCyber analyses a January 2026 Mustang Panda PlugX sample delivered through a three-file set dropped by an MSI… (t.co)

BlueCyber analyses a January 2026 Mustang Panda PlugX sample delivered through a three-file set dropped by an MSI chain. The sample uses DLL sideloading, staged decryption, manual in-memory loading, and a final WinHTTP connection into the c

by @virusbtn (Virus Bulletin) · backlist 2026-06-02 · rubric 64.0

87.

Most observability for agents is the wrong shape.

Most observability for agents is the wrong shape. You install a debugger. You connect it to your harness. You hope the spans the framework chose to emit cover the failure mode you're chasing. When they don't, you patch instrumentation into

by @mfpiccolo (Mike Piccolo) · backlist 2026-06-02 · rubric 63.0

88.

i built an implementation of google's map-reduce paper.

i built an implementation of google's map-reduce paper. > master/worker architecture over tcp rpc > byte-range input splitting > reducers fetch intermediate partitions over rpc > worker health checks + automatic task re-queueing > atomi

by @numaan27 (numaan) · backlist 2026-06-02 · rubric 63.0

89.

I got 2 intel bad boys on Friday.

I got 2 intel bad boys on Friday. At this point I’m struggling to find more power for all this. - 4x 6000s - 1x DGX Spark - AMD Strix - 4x 3090 - 2x intel arc b70 - Mac mini 16gb - MacBook Pro 32gb 544gb VRAM 300gb mixed Total = 844g

by @0xSero · backlist 2026-06-02 · rubric 63.0

90.

1/5

1/5 New paper: Representation Alignment Rests on Linear Structure with Guy Bresler and Yury Polyanskiy The Platonic Representation Hypothesis (PRH) posits that representations of data from different models converge as model performance imp

by @BangachevKiril (Kiril Bangachev) · backlist 2026-06-02 · rubric 63.0