Backlist — 12 Jun 2026 UTC

1.

Biased RSA keys factored via polynomial factorization

RSA keys biased toward zero bits turned integer factorization into an easier polynomial factorization problem, exposing hundreds of real-world keys tied to a patched CompleteFTP bug

by @trailofbits (Trail of Bits) · backlist 2026-06-12 · rubric 88.0
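
A minimal sketch of why zero-heavy factors can turn integer factoring into polynomial factoring. This is not the Trail of Bits method itself, which the entry does not describe. The digit size `BASE`, the helper names, and the toy coefficient lists are all assumptions for illustration, and the toy "factors" are not real RSA primes. If both factors have only small, sparse digits in some base, multiplying them produces no carries, so the digits of the product are exactly the coefficients of the product polynomial.

```python
# Toy illustration only: the "factors" below are small made-up integers,
# not real RSA primes, and BASE = 2**16 is an assumed digit size.
BASE = 2 ** 16

def digits(n, base=BASE):
    """Little-endian base-`base` digits of a non-negative integer."""
    out = []
    while n:
        n, d = divmod(n, base)
        out.append(d)
    return out or [0]

def from_digits(coeffs, base=BASE):
    """Evaluate the polynomial with these coefficients at x = base."""
    return sum(c * base ** i for i, c in enumerate(coeffs))

def poly_mul(a, b):
    """Multiply two integer polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Hypothetical sparse, small-digit factors (mostly zero digits).
p_coeffs = [3, 0, 5, 0, 1]
q_coeffs = [7, 2, 0, 0, 1]
p, q = from_digits(p_coeffs), from_digits(q_coeffs)
n = p * q

# No carries occur, so reading n in base BASE recovers the product polynomial.
assert digits(n) == poly_mul(p_coeffs, q_coeffs)
```

Under these assumptions, factoring the polynomial read off from `n` over the integers and evaluating each factor at `BASE` would recover `p` and `q`. The factoring step itself is not shown here.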

2.

400+ Arch Linux AUR packages compromised by a fake maintainer

An attacker adopted orphaned AUR packages and inserted infostealer malware and a rootkit, making old community-maintained packages an immediate audit target

by @IntCyberDigest (International Cyber Digest) · backlist 2026-06-12 · rubric 72.0

3.

Paradigm bought the solution to Kryptos and encrypted it again

The final unsolved section of the CIA’s Kryptos sculpture became a new public cryptography challenge after the solution was bought from the artist and re-encrypted

by @paradigm (Paradigm) · backlist 2026-06-12 · rubric 72.0

4.

KPMG's AI success report cited hallucinated case studies

A Big Four report praising business AI adoption used fabricated case studies, turning hallucination from an abstract risk into an institutional failure mode

by @anneapplebaum (Anne Applebaum) · backlist 2026-06-12 · rubric 72.0

5.

An in-browser CAD language that compiles to WebGPU shaders

A browser CAD system can now switch between CPU WASM and GPU WebGPU execution for dual contouring and marching cubes, giving interactive geometry tools a major local performance path

by @unixpickle (Alex Nichol) · backlist 2026-06-12 · rubric 88.0

6.

MiniMax M3 open weights: 428B parameters, 1M context (t.co)

MiniMax released a 428B-parameter open-weight model with 23B active parameters and MiniMax Sparse Attention for million-token contexts

by @MiniMax_AI (MiniMax (official)) · backlist 2026-06-12 · rubric 88.0

7.

SpaceX and the founder cost of capital (x.com)

Musk raised roughly $10B in primary SpaceX equity while retaining top-decile founder ownership, showing how capital efficiency can compound into unusually low dilution

by @credistick (Dan Gray) · backlist 2026-06-12 · rubric 66.0

8.

FrontierMath v2: 42% of problems had errors fixed

Epoch’s audit corrected errors in 42% of FrontierMath Tiers 1–4 problems, raising scores while leaving rankings broadly similar

by @EpochAIResearch (Epoch AI) · backlist 2026-06-12 · rubric 91.0

9.

The Jones Act's goals versus outcomes

A maritime law framed around security resilience and industrial self-sufficiency has produced a tiny, defense-irrelevant shipbuilding base and higher goods costs

by @johnarnold (John Arnold) · backlist 2026-06-12 · rubric 82.0

10.

World Tracing: pixel-aligned geometry from one image (t.co)

World Tracing generates complete 3D geometry from a single image while preserving a trace from every 3D point back to its source pixel

by @HaoZhang623 (Hao Zhang) · backlist 2026-06-12 · rubric 86.0

11.

Playbit apps running in web browsers via WASM

Playbit is bringing its app runtime to browsers with both the runtime and apps running as WASM, making the browser act more like a portable OS layer

by @rsms (Rasmus Andersson) · backlist 2026-06-12 · rubric 82.0

12.

22-DoF humanoid balances a six-link pendulum using RL (x.com)

A 22-degree-of-freedom humanoid controlled a six-link pendulum attached to its torso using only its own body actuation, without a hyperparameter sweep and succeeding on the first seed

by @judokach (Nikita Kachaev) · backlist 2026-06-12 · rubric 90.0

13.

River Workflows v2

River’s largest revision delivers 60% higher throughput and 20x faster backlog draining, and adds arbitrary out-of-band signal waits, timers, and CEL wait expressions

by @brandur (Brandur) · backlist 2026-06-12 · rubric 89.0

14.

MIT's autonomous robotics textbook is free on GitHub

MIT Press made a full autonomous robotics textbook freely available, covering mechanisms, kinematics, sensors, planning, localization, vision, and neural networks

by @IlirAliu_ (Ilir Aliu) · backlist 2026-06-12 · rubric 42.0

15.

A new cryoEM imaging technique for cellular biology

A new cryoEM imaging technique improves the raw measurement layer for cellular biology, which changes what researchers can directly observe before modeling begins

by @charlesxjyang (Charles Yang) · backlist 2026-06-12 · rubric 62.0

16.

A single model outperformed stock-specific models on HFT data

Two Sigma found that one model trained across hundreds of stocks beat separate models trained per stock, suggesting useful shared structure in high-frequency market data

by @GoshawkTrades (Goshawk Trades) · backlist 2026-06-12 · rubric 72.0

17.

A balcony microphone, bird detection, and a framed e-ink collage

A balcony microphone identifies passing birds and updates a framed color e-ink display with a daily collage of the species heard nearby

by @WarnerTeddy (Teddy) · backlist 2026-06-12 · rubric 86.0

18.

PhyCo: video models with controllable physics (t.co)

PhyCo lets a video generation model take friction, restitution, deformation, and force as inputs instead of leaving the physical behavior of a scene implicit

by @nnsriram97 (Sriram Narayanan) · backlist 2026-06-12 · rubric 82.0

19.

Creative time density

Creative time density names the awe generated when hundreds or thousands of person-hours are compressed into a specific artifact, performance, place, or experience

by @andy_matuschak (Andy Matuschak) · backlist 2026-06-12 · rubric 78.0

20.

Compilecat: an opt-in JS performance compiler

Compilecat applies closure-style optimizations such as inlining, loop unrolling, scalar replacement, and constant inlining under explicit annotations, making one Chrome example about 4x faster

by @isaac_mason_ (isaacmason) · backlist 2026-06-12 · rubric 78.0

21.

A $1,499 laundry-folding robot (x.com)

A compact autonomous laundry-folding robot with onboard private compute, apartment-friendly dimensions, and human-quality folds is being offered for $1,499 in the US and Canada

by @Deeplearner2 (Usman Roshan) · backlist 2026-06-12 · rubric 74.0

22.

Google Earth flight simulator is now on the web (t.co)

Google Earth brought its flight simulator to the web alongside desktop-grade features like elevation profiles and expanded import support

by @googleearth (Google Earth) · backlist 2026-06-12 · rubric 46.0

23.

Coinbase launches 24/7 US-regulated gold and silver futures

US-regulated gold and silver futures are moving to continuous weekend trading on Coinbase, with oil and other contracts planned next

by @coinbase (Coinbase) · backlist 2026-06-12 · rubric 66.0

24.

Kimi-K2.7-Code open-sourced with SGLang day-zero support (x.com)

Kimi’s open coding model adds long multi-step coding gains and lower reasoning-token use while keeping the K2.5/K2.6 architecture deployable in existing SGLang setups

by @lmsysorg (LMSYS Org) · backlist 2026-06-12 · rubric 86.0

25.

Agent-held device keys now sign AgentKeys cap minting

AgentKeys cap minting now requires a request-bound client signature from the agent-held device key, so a broker can coordinate but not unilaterally authorize the worker

by @heimaNetwork (Heima) · backlist 2026-06-12 · rubric 82.0

26.

Apple unified OS version numbers for @available engineering (x.com)

Unifying Apple OS version numbers reduces the boilerplate and mismatch risk in Swift availability checks across iOS, macOS, watchOS, tvOS, and visionOS

by @kemchenj (Kem Chen) · backlist 2026-06-12 · rubric 68.0

27.

The Skyline Project: open-source building-history maps

The Skyline Project offers interactive open-source maps of NYC, San Francisco, and London where buildings can be searched by history, architecture, and year

by @_Shark_byte (Perly) · backlist 2026-06-12 · rubric 88.0

28.

Cheapo monitors as a source of advanced optical films

Cheap monitors contain diffusers, polarizers, prismatic sheets, and high-albedo matte panels that cost far more when bought individually from optical suppliers

by @tkanarsky (Tim Kanarsky) · backlist 2026-06-12 · rubric 76.0

29.

SCALE-CLIP maps endogenous RNA-binding protein binding at scale

SCALE-CLIP enables direct comparison of endogenous RNA-binding protein binding across many factors and links large-scale binding maps to splicing analysis

by @YoshimiLab (Yoshimi Lab (NCC Official Account)) · backlist 2026-06-12 · rubric 72.0

30.

2048 in under 90 lines of Haskell (t.co)

A compact Haskell implementation of 2048 fits the whole game into fewer than 90 lines and comes with an explanatory article

by @lotz84_ (lotz) · backlist 2026-06-12 · rubric 78.0

31.

Result #32: @mihai673 has achieved a 30-step improvement over the old 2026/05/09 record by adding a SODA (Pethick et al. 2026)-style anchor towards init. It is unknown whether this technique can also improve the current record. 2/5

by @kellerjordan0 (Keller Jordan) · backlist 2026-06-12 · rubric 91.0

32.

New record on GB300 NVL72: SGLang exceeds 12K tok/s per GPU on DeepSeek V4 Pro 1.6T (FP4, 8K/1K), orchestrated wi… (x.com)

New record on GB300 NVL72: SGLang exceeds 12K tok/s per GPU on DeepSeek V4 Pro 1.6T (FP4, 8K/1K), orchestrated with NVIDIA Dynamo (SGLang) and MTP. Per @SemiAnalysis_ InferenceX benchmarks, performance stays strong across the entire in

by @lmsysorg (LMSYS Org) · backlist 2026-06-12 · rubric 90.0

33.

Five recent notable Modded-NanoGPT optimization results:

Five recent notable Modded-NanoGPT optimization results: Result #31: Kai Lion and Florian Hübler have improved their Muown-based run from 3075 to 2995 steps by adding NorMuon & ContraMuon modifications. 1/5

by @kellerjordan0 (Keller Jordan) · backlist 2026-06-12 · rubric 89.0

34.

Tokenminning: Token⋅Min⋅ning

Tokenminning: Token⋅Min⋅ning Get the *same quality* work done in the *same time* as your tokenmaxxing peers but with the LEAST amount of tokens Tokenmaxxing is too easy to hack (just run things in loop, in parallel, etc.) What are some g

by @GuangyuRobert (Robert Yang) · backlist 2026-06-12 · rubric 89.0

35.

I confess: you can't dynamically resize a (x.com)

I confess: you can't dynamically resize a @modal sandbox. Because you don't have to Sandbox workloads are spiky: install, wait, spike, wait We built our runtime to be *burstable*. Request the min & burst above it when your workload s

by @AAAzzam (Adam Azzam) · backlist 2026-06-12 · rubric 89.0

36.

DiffusionGemma can now run at 2000+ tokens/sec! (t.co)

DiffusionGemma can now run at 2000+ tokens/sec! We made local DiffusionGemma inference 1.8× faster. Run it on 18GB RAM via Unsloth Studio. GitHub: https://github.com/unslothai/unsloth … Guide: https://unsloth.ai/docs/models/diffus

by @UnslothAI (Unsloth AI) · backlist 2026-06-12 · rubric 89.0

37.

This data is wrong right now. Artemis are likely using time and sales files from Polymarket US which are currentl…

This data is wrong right now. Artemis are likely using time and sales files from Polymarket US which are currently overcounting World Cup volume by a factor of 100x. This is an error on Polymarket's side. June 10th Poly US did 79M ish. Int

by @datadashboards (dash) · backlist 2026-06-12 · rubric 89.0

38.

implemented q-chunking on top of it

implemented q-chunking on top of it offline only for now already converges significantly faster: 84% at 50k steps vs 56% for vanilla fql online fine-tuning + harder envs coming next

by @PrajwalAvhad8 (crisiumnih) · backlist 2026-06-12 · rubric 89.0

39.

This Manhattan pixel map was the biggest pain in the ███ with the best payoff: a 4 x 8 ft wool blanket

This Manhattan pixel map was the biggest pain in the ███ with the best payoff: a 4 x 8 ft wool blanket I have a whole new respect for pixel artists, they actually have to sweat details. I usually map big and shrink so the visual-overwhelm

by @youwillmakemaps (Evan Applegate) · backlist 2026-06-12 · rubric 89.0

40.

over the course of adding features to this app, fable found one difficult. it turns out a certain apple API for p…

over the course of adding features to this app, fable found one difficult. it turns out a certain apple API for programmatically moving windows between spaces silently stopped working 2 years ago. it found extensive discussion about this an

by @tenobrus (Tenobrus) · backlist 2026-06-12 · rubric 88.0

41.

I created this new speedrun track, which compares results in terms of steps rather than wallclock, specifically t…

I created this new speedrun track, which compares results in terms of steps rather than wallclock, specifically to give a fair chance to optimizers other than Muon. Happy to see the resulting accumulation of public knowledge!

by @kellerjordan0 (Keller Jordan) · backlist 2026-06-12 · rubric 88.0

42.

MiniMax M3, Open-Weight, Now On Hugging Face (t.co)

MiniMax M3, Open-Weight, Now On Hugging Face Weights: https://huggingface.co/MiniMaxAI/MiniMax-M3 … MiniMax Sparse Attention: https://huggingface.co/papers/2606.13392 …

by @MiniMax_AI (MiniMax (official)) · backlist 2026-06-12 · rubric 88.0

43.

Now that I have your attention by posting this spinning point cloud GIF, I'd like to propose a litmus test for AI…

Now that I have your attention by posting this spinning point cloud GIF, I'd like to propose a litmus test for AI mechanistic interpretability research. You might call it the "interp hammer" test. If the things achieved by a mechanistic in

by @StephenLCasper (Cas (Stephen Casper)) · backlist 2026-06-12 · rubric 88.0

44.

#つぶやきGLSL float i,e,R,s;vec3 q,p,d=vec3(FC.xy/r*.8-vec2(.4,-.6),1);for(q.zy--;i++<80.;){o.rgb+=hsv(q.z,.5,min(e*s…

#つぶやきGLSL float i,e,R,s;vec3 q,p,d=vec3(FC.xy/r*.8-vec2(.4,-.6),1);for(q.zy--;i++<80.;){o.rgb+=hsv(q.z,.5,min(e*s-.3,1.)/35.);s=5.;p=q+=d*e*R*.3;p=vec3(log(R=length(p*1.3)),exp2(-p.z/R),atan(p.y,p.x)-t*.3);for(e=--p.y;s<1e3;s+=s)e+=cos(dot(

by @YoheiNishitsuji (Yohei Nishitsuji) · backlist 2026-06-12 · rubric 88.0

45.

Two insights from LeapAlign: (x.com)

Two insights from LeapAlign: 1. Gradient descent, rather than GRPO, is native to diffusion post-training. 2. Early generation steps should be trained, such that image layout can be better optimized. Thanks @hillbig for posting this work.

by @LiangZheng_06 (Liang Zheng) · backlist 2026-06-12 · rubric 88.0

46.

GPT-5.5-xhigh's FrontierMath 4 score jumped from 35% to 73% after EpochAI fixed errors in the benchmark

by @scaling01 (Lisan al Gaib) · backlist 2026-06-12 · rubric 86.0

47.

We’re extending Harvey's Legal Agent Benchmark (LAB) to in-house contracting.

We’re extending Harvey's Legal Agent Benchmark (LAB) to in-house contracting. Contracting is the highest-volume workstream for in-house legal teams, where a huge amount of business risk gets negotiated into binding agreements. Benchmarki

by @gabepereyra (Gabe Pereyra) · backlist 2026-06-12 · rubric 86.0

48.

Introducing Gemini-SQL2, our breakthrough text-to-SQL capability powered by Gemini 3.1 Pro! We've achieved state-…

Introducing Gemini-SQL2, our breakthrough text-to-SQL capability powered by Gemini 3.1 Pro! We've achieved state-of-the-art results on the highly competitive BIRD benchmark, translating natural language into execution-ready SQL queries.

by @GoogleResearch (Google Research) · backlist 2026-06-12 · rubric 86.0

49.

Big updates for InferenceBench v1.0.1! (t.co)

Big updates for InferenceBench v1.0.1! Some highlights: - 10 more entries to the leaderboard, including Fable 5, Opus 4.8, Kimi 2.6, and Gemini 3.5 Flash - Re-scoring / Re-evaluation of select models See the changes for yourself at: htt

by @jehyeoky248 (Jehyeok Yeon @ ICML 2026) · backlist 2026-06-12 · rubric 86.0

50.

Two steps to SOTA-level depth estimation: a strong T2I model + a simple post-training recipe. No bespoke depth ar… (x.com)

Two steps to SOTA-level depth estimation: a strong T2I model + a simple post-training recipe. No bespoke depth architectures or complex pipelines needed -- the 3D understanding is already in the prior. Fantastic work led by @BDuisterhof!

by @KeunhongP (Keunhong Park) · backlist 2026-06-12 · rubric 86.0

51.

Claude 5 Fable (Ultracode)

Claude 5 Fable (Ultracode) "Make a playable alpine glacial valley at sunrise" No meshes or models. Everything you see is math. Fable screenshotted its own work and iterated. Took ~30 mins, ~500k tokens, ~2500 lines of code, and ~$25. Ext

by @deedydas (Deedy) · backlist 2026-06-12 · rubric 86.0

52.

I had a lot of Fable tokens to use up before my weekly reset, so I made this live 3D map of London with Three.js

I had a lot of Fable tokens to use up before my weekly reset, so I made this live 3D map of London with Three.js Every train, bus, boat and plane is real and live right now! - Tube, bus and riverboat data from TfL - National Rail trains f

by @jamespotter (James Potter (rephonic.com)) · backlist 2026-06-12 · rubric 86.0

53.

While everyone talks about Mythos vs GPT-5.5, we've tested other near SOTA models on our ErdosBench.

While everyone talks about Mythos vs GPT-5.5, we've tested other near SOTA models on our ErdosBench. Smoke test on 14 problems with 7 models: Kimi K2.6, Gemini 3.1 Pro, GLM 5.1, MiniMax M3, DeepSeek V4 Pro, Nemotron 3 Ultra and Gemma 3 27

by @prz_chojecki (Przemek Chojecki | PC) · backlist 2026-06-12 · rubric 86.0

54.

PROJECT NULLFRAME: A live telemetry dashboard brought to life with Fable 5 - in Nothing's design language. (x.com)

PROJECT NULLFRAME: A live telemetry dashboard brought to life with Fable 5 - in Nothing's design language. Your real fps, battery + network; your cursor becomes a seismograph. It tells you when it's simulating. Prompt inspo og @dominikmar

by @mickces (mi©k) · backlist 2026-06-12 · rubric 86.0

55.

Very excited to see that the core idea of DiffusionGemma directly stems from our work, Residual Context Diffusion…

Very excited to see that the core idea of DiffusionGemma directly stems from our work, Residual Context Diffusion (arXiv:2601.22954)! Code- and architecture-level comparisons are attached. RCD is accepted to ICML 2026! See you in Seoul!

by @yuezhouhu (Yuezhou Hu) · backlist 2026-06-12 · rubric 86.0

56.

You can pick up a baby bird and put it back in the nest. The parents won't smell your hands and abandon it. Songb…

You can pick up a baby bird and put it back in the nest. The parents won't smell your hands and abandon it. Songbirds barely smell anything, and that old "the mother will reject it" line is one of the worst wildlife myths going. If yo

by @giveashitnature (Give A Shit About Nature) · backlist 2026-06-12 · rubric 86.0

57.

Context Arena: Added (x.com)

Context Arena: Added @AnthropicAI's Claude Opus 4.8 on 8-needle GDM-MRCRv2. Thanks @OpenRouter for the credits to run Opus 4.8 @ max. All results at: https://contextarena.ai Opus 4.8 (max reasoning) lands #2 on AUC@128k, behind only

by @DillonUzar (Dillon Uzar) · backlist 2026-06-12 · rubric 86.0

58.

I'm rate limiting the overall amount of billable ledger entries, so there's an effective max billing per minute …

I'm rate limiting the overall amount of billable ledger entries, so there's an effective max billing per minute and per hour. If we see spamming we don't bill the advertisers. Trying to be as fair as possible. It's fair play, but don't ru

by @andrewmccalip (Andrew McCalip) · backlist 2026-06-12 · rubric 86.0
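
A rough sketch of what a combined per-minute and per-hour cap on billable entries could look like. The post does not describe the actual implementation, so the class name `BillingLimiter`, the sliding-window approach, and the cap values are all assumptions made for illustration.

```python
from collections import deque

class BillingLimiter:
    """Hypothetical sliding-window limiter: an entry is billable only if
    it stays under both the per-minute and the per-hour cap."""

    def __init__(self, per_minute, per_hour):
        # (window length in seconds, max billed entries in that window)
        self.limits = [(60.0, per_minute), (3600.0, per_hour)]
        self.events = deque()  # timestamps of entries billed in the last hour

    def try_bill(self, now):
        # Forget entries that have aged out of the longest window.
        while self.events and now - self.events[0] >= 3600.0:
            self.events.popleft()
        # Refuse to bill if any window is already at its cap.
        for window, cap in self.limits:
            recent = sum(1 for t in self.events if now - t < window)
            if recent >= cap:
                return False
        self.events.append(now)
        return True

# Assumed caps: 2 billable entries per minute, 3 per hour.
limiter = BillingLimiter(per_minute=2, per_hour=3)
results = [limiter.try_bill(t) for t in (0, 1, 2, 61, 130, 3600)]
```

With these assumed caps, the third entry is refused by the minute window and the fifth by the hour window. Entries that are refused are simply not billed, which matches the stated intent of not charging advertisers for spam.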

59.

A beautiful "software factory" with its own "software byproducts".

A beautiful "software factory" with its own "software byproducts". As Fable generates 100% Cloudflare IaC coverage, it also produces a perfectly patched API spec and Effect SDK. All important errors and fixed data types are discovered fr

by @samgoodwin89 (sam) · backlist 2026-06-12 · rubric 86.0

60.

The performance win is pretty huge. Even for really simple shapes, just being able to run dual contouring in a ma…

The performance win is pretty huge. Even for really simple shapes, just being able to run dual contouring in a massively parallel environment gives like a ~3x speed improvement.

by @unixpickle (Alex Nichol) · backlist 2026-06-12 · rubric 86.0

61.

Built a browser tool called Blobtrack in (x.com)

Built a browser tool called Blobtrack in @cursor_ai so I don’t have to hand-keyframe surveillance boxes in After Effects for hours. Just drag your box and move frames with arrow keys. Added a few fun fx as well.

by @connorgrasso_ (Connor) · backlist 2026-06-12 · rubric 86.0

62.

what you get from STV+ViL can be thought of as a "transmutation" of OPSD's privileged self-teacher supervision --…

what you get from STV+ViL can be thought of as a "transmutation" of OPSD's privileged self-teacher supervision -- instead of distribution matching, you purify and reify it into ICL via corrective verbal feedback.

by @infinitefun_ (sean lee) · backlist 2026-06-12 · rubric 86.0

63.

Red sprites from last night.

Red sprites from last night. Last night was crazy. I captured over 60 individual sprite events. Captured from southern Minnesota looking toward the storm over Missouri / Illinois.

by @EricTheSpaceCat (EricTheCat) · backlist 2026-06-12 · rubric 86.0

64.

at amazon in 2023 they rejected my text to SQL tool with Claude 2 because it had a 70% success rate and needed to…

at amazon in 2023 they rejected my text to SQL tool with Claude 2 because it had a 70% success rate and needed to be 90% to deploy to prod

by @yaboilyrical (nightwing) · backlist 2026-06-12 · rubric 84.0

65.

People frequently ask me how many tasks a benchmark should have. There's no exact answer but here's my intuition

People frequently ask me how many tasks a benchmark should have. There's no exact answer but here's my intuition (tl;dr: aim for 300-500 tasks)

by @OfirPress (Ofir Press) · backlist 2026-06-12 · rubric 84.0

66.

long-horizon coding is the future

by @MangQiuyang (Qiuyang Mang) · backlist 2026-06-12 · rubric 84.0

67.

So it begins - Stylocard V2 is in the works

So it begins - Stylocard V2 is in the works Specs: - 20 key stylophone w/ ENIG plating for style points - RISC-V CH32X035 MCU for the lolz - Piezo buzzer for sound - CR2016 coin cell battery - MIDI over USB-C

by @maciint0sh (Ali A) · backlist 2026-06-12 · rubric 84.0

68.

Reward eng should be the last resort in RL. Curriculum + simple reward

by @jsuarez (Joseph Suarez) · backlist 2026-06-12 · rubric 84.0

69.

startup data point:

startup data point: founder shows me a customer’s support queue at 9:18pm. 43 tickets. 12 refund edge cases. 3 policy exceptions. 1 angry enterprise account. then asks: “which of these still needs a human?” that is a better pitch than 9

by @geoffreywoo (GEOFF) · backlist 2026-06-12 · rubric 84.0

70.

want to point out a few really interesting things here

want to point out a few really interesting things here 1. Claude Code is actually the worst performing harness when using the same model, significantly behind opencode and cursor cli this is the core reason i've been against the LLM compa

by @kunchenguid (Kun Chen) · backlist 2026-06-12 · rubric 84.0

71.

totally forgot the team built 'sentry local' which takes our tech from Spotlight and bundles it into the CLI, gi…

totally forgot the team built 'sentry local' which takes our tech from Spotlight and bundles it into the CLI, giving you (aka your robot buddies) access to Sentry telemetry (hello traces)

by @zeeg (David Cramer) · backlist 2026-06-12 · rubric 84.0

72.

Icon Museum now features a Wall of Icons, so you can explore the full collection on one endless canvas

by @chroxify (Christo) · backlist 2026-06-12 · rubric 84.0

73.

Your favorite style of putter might be costing you strokes.

Your favorite style of putter might be costing you strokes. We tested 79 putters. The top 15 were all zero-torque. Here’s what the data showed

by @MyGolfSpy (MYGOLFSPY) · backlist 2026-06-12 · rubric 83.0

74.

I tried this so you don’t have to.

I tried this so you don’t have to. At the end, I got: - 10,000 impressions - $600 spent - 0-400 clicks (tracking isn’t very good) - 0 conversions I’m probably not going to spend more on this platform at the current stage because it’s a ve

by @anshnanda (Ansh Nanda) · backlist 2026-06-12 · rubric 83.0

75.

it brings me no joy to report I spent a year wondering why I was constantly sleepy and had a low sleep score on m…

it brings me no joy to report I spent a year wondering why I was constantly sleepy and had a low sleep score on my whoop, which was totally cured by simply not wearing the whoop.

by @WillManidis (Will Manidis) · backlist 2026-06-12 · rubric 83.0

76.

Very cool paper on the "hacker-fixer loop" by (x.com)

Very cool paper on the "hacker-fixer loop" by @fjzzq2002 et al. A 3-agent LLM system that automatically hardens benchmark verifiers against reward hacking: 1. Hacker tries to pass the verifier without solving the task. 2. Fixer patche

by @RobertTLange (Robert Lange) · backlist 2026-06-12 · rubric 83.0

77.

the most interesting thing here IMO is that this involves On-Policy Self Distillation, but the distillation gradi…

the most interesting thing here IMO is that this involves On-Policy Self Distillation, but the distillation gradient never touches the generator. this airgaps us against OPSD's biggest weakness — bias introduced by having non-causal privi

by @infinitefun_ (sean lee) · backlist 2026-06-12 · rubric 83.0

78.

SCOOP: Meta plans to clamp down on skyrocketing AI costs inside the company by imposing limits on employees’ toke…

SCOOP: Meta plans to clamp down on skyrocketing AI costs inside the company by imposing limits on employees’ token usage, the company told staff in a memo on Tuesday, just weeks after it pushed them to adopt AI tools in their work.

by @jyoti_mann1 (Jyoti Mann) · backlist 2026-06-12 · rubric 82.0

79.

We’ve backfilled FrontierMath: Tiers 1–4 (v2) scores for a selection of notable models, including recent Claude O…

We’ve backfilled FrontierMath: Tiers 1–4 (v2) scores for a selection of notable models, including recent Claude Opus models. You can find these on our website. We will add scores for Claude Fable 5 and GPT Pro models shortly.

by @EpochAIResearch (Epoch AI) · backlist 2026-06-12 · rubric 82.0

80.

just launched scanner by endera -- paste a soundcloud/mixcloud link or upload audio, ACRCloud detects the full tr… (t.co)

just launched scanner by endera -- paste a soundcloud/mixcloud link or upload audio, ACRCloud detects the full tracklist, then each track gets enriched with BPM/key via http://everysong.site and export as txt file for set planning. http

by @verymehari (Kiru) · backlist 2026-06-12 · rubric 82.0

81.

Already 90m deposits in Coinbase high yield earn product on (x.com)

Already 90m deposits in Coinbase high yield earn product on @base (30m from CB users) in 1 day. Nice collab between @coinbase, @SteakhouseFi & @ethena. The DeFi mullet keeps growing.

by @MerlinEgalite (Merlin Egalite) · backlist 2026-06-12 · rubric 82.0

82.

we all had the same realization: fable is expensive, but great at orchestration.

we all had the same realization: fable is expensive, but great at orchestration. you get 95% of the power at 30% of the cost by letting it orchestrate other models this is why i built omegacode: your agent writes a script that can orche

by @sawyerhood (Sawyer Hood) · backlist 2026-06-12 · rubric 82.0

83.

The Renaissance of Sparse Attention (old dilated like Longformer/Longnet, compressed like DeepSeek, query-aware l… (x.com)

The Renaissance of Sparse Attention (old dilated like Longformer/Longnet, compressed like DeepSeek, query-aware like MiniMax) vs. Hot linear attention/recurrence: Two separate lines of long-context scaling. We have a series of works with @

by @XiuyingWei966 (Xiuying Wei) · backlist 2026-06-12 · rubric 82.0

84.

prediction:

prediction: agents will expose a funny lie in enterprise software. half the product surface was not there because users loved it. it was there because humans needed reminders, approvals, queues, status pages, nudges, and meetings to move

by @geoffreywoo (GEOFF) · backlist 2026-06-12 · rubric 82.0

85.

How I use Claude Code and Remotion to make animated diagrams.

How I use Claude Code and Remotion to make animated diagrams. Sorry, it's not a single prompt. 1. Find an input language the model knows well. For example, Mermaid for flowcharts. Claude writes it fluently, so it's my entry point. 2. Use

by @delba_oliveira (Delba) · backlist 2026-06-12 · rubric 82.0

86.

If you view a typical IPO pop as 20%, then the "bar" for SpaceX is $162 -- which is the current indicated open (a…

If you view a typical IPO pop as 20%, then the "bar" for SpaceX is $162 -- which is the current indicated open (although it's been falling)

by @danprimack (Dan Primack) · backlist 2026-06-12 · rubric 82.0
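
The arithmetic behind the $162 figure can be sketched as follows. The post does not state the offer price. The value computed below is only what a 20% pop to $162 would imply, so treat it as an assumption. The function names are likewise hypothetical.

```python
def implied_offer_price(bar_price, pop):
    """Offer price that would make `bar_price` equal to a `pop` gain."""
    return bar_price / (1.0 + pop)

def bar_for_pop(offer_price, pop):
    """Price the stock must reach to deliver a `pop` gain over the offer."""
    return offer_price * (1.0 + pop)

# A 20% pop landing at $162 implies an offer price of about $135.
offer = implied_offer_price(162.0, 0.20)
print(round(offer, 2))
```

Read the other way, any indicated open below `bar_for_pop(offer, 0.20)` would be a smaller-than-typical pop under the post's 20% assumption.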

87.

AI Native companies scale by productizing everything - but for themselves.

AI Native companies scale by productizing everything - but for themselves. The customer, though, always buys a service.

by @ponnappa (Sidu Ponnappa) · backlist 2026-06-12 · rubric 82.0

88.

Mat got a new update, update your app

Mat got a new update, update your app Basic fixes + A Big one: collective mats! Invite up to 5 friends over iMessage and decorate one mat together, with live sync and widget updates. Plus: - cutout-letter alphabet everywhere - type words i

by @rahulbhadoriiya (Rahul Bhadoriya) · backlist 2026-06-12 · rubric 82.0

89.

Cool work on refining coarse VLM actions using a flow matching policy π(a₀ | o) → a₁ where a₀ ∼ N(0, 1) by first…

Cool work on refining coarse VLM actions using a flow matching policy π(a₀ | o) → a₁ where a₀ ∼ N(0, 1) by first reversing (inverting) the given coarse action a₁ via â₀ = π⁻¹(a₁ | o) and then reconstructing it in the forward direction i.e

by @siddancha (Siddharth Ancha) · backlist 2026-06-12 · rubric 82.0

90.

Raising Fund I/II? Be careful with the big consultants/asset managers.

Raising Fund I/II? Be careful with the big consultants/asset managers. If they do a small bite ($5-20M) but demand multiples in coinvest, they are screwing you and have no plans to be a real anchor. These LPs can write $50-500M. They are

by @endowment_eddie (Endowment Eddie) · backlist 2026-06-12 · rubric 81.0