Backlist — 02 Jun 2026 UTC

Top 90 curated tweets ranked for substance on 02 Jun 2026 UTC.

16.

Implementations encode years of edge cases, operational lessons, failed experiments, and invariants, which is why clean rewrites often discard the real system they are trying to replace

by (chad fowler) · backlist 2026-06-02 · rubric 30.0

38.

(x.com)

So /goal is awesome. Over the past few weeks I used @PrimeIntellect to train a 149M late-interaction model based on GTE-ModernColBERT-v1 using PyLate, focused on clause extraction from legal contracts. On the MLEB benchmark it does well

by (Kevin Madura) · backlist 2026-06-02 · rubric 82.0

39.

SDPO++ for Continual Learning. Day 5 of Trajectory: we modify Self Distillation Policy Optimization for long-horizon agentic tasks. SDPO is a promising route. It learns from a single trajectory, with no group required and failures still p

by (Ronak Malde) · backlist 2026-06-02 · rubric 79.0

65.

Just imagine: you create a model (MinMax M.27) that scores the SAME results as Opus 4.6 on SWE Bench PRO. But when we create a benchmark where your model didn't train, you literally score 0. Because MinMax models are shit, and incomparabl

by (Melvyn • Builder) · backlist 2026-06-02 · rubric 71.0

74.

(x.com)

Congrats to @JetBrains on Mellum2-12B-A2.5B-Thinking, an open-source 12B MoE that activates just 2.5B params, handling both natural language and code with a 128K context. Mellum2 runs natively in vLLM from day 0, with reasoning parser a

by (vLLM) · backlist 2026-06-02 · rubric 68.0

76.

(x.com)

so @morgallant has optimized FTS tokenization throughput to 423 MiB/s and open-sourced it ( https:// github.com/turbopuffer/al yze …) I keep telling him that it would be really high agency to get to DRAM bandwidth (~100 GiB/s), and he ke

by (Simon Eskildsen) · backlist 2026-06-02 · rubric 68.0

89.

I got 2 intel bad boys on Friday. At this point I’m struggling to find more power for all this: 4x 6000s, 1x DGX Spark, AMD Strix, 4x 3090, 2x intel arc b70, Mac mini 16gb, MacBook Pro 32gb. 544gb VRAM, 300gb mixed. Total = 844g

by · backlist 2026-06-02 · rubric 63.0

90.

1/5 New paper: "Representation Alignment Rests on Linear Structure," with Guy Bresler and Yury Polyanskiy. The Platonic Representation Hypothesis (PRH) posits that representations of data from different models converge as model performance imp

by (Kiril Bangachev) · backlist 2026-06-02 · rubric 63.0