@maximelabonne on Backlist

We patched LFM2.5-350M (pre-trained on 28T tokens) to transform a causal decoder into a bidirectional encoder. It worked extremely well: both Embedding and ColBERT models get best-in-class performance, especially for multi/cross-lingual ta
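
To make the decoder-to-encoder idea concrete, here is a minimal sketch of the difference between causal and bidirectional self-attention. It is a toy illustration only, not the actual LFM2.5 patch: the `attention` function, the identity projections (q = k = v = x), and the 2-dimensional "token" vectors are all assumptions made up for this example. The point it shows is that under a causal mask the first token's output cannot depend on later tokens, while without the mask every token sees the full sequence.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(x, causal):
    """Toy single-head self-attention with identity projections (hypothetical)."""
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        # Causal: position i attends to positions 0..i only.
        # Bidirectional: position i attends to every position.
        visible = range(i + 1) if causal else range(n)
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            for j in visible
        ]
        weights = softmax(scores)
        out.append([
            sum(w * x[j][k] for w, j in zip(weights, visible))
            for k in range(d)
        ])
    return out

# Two sequences that differ only in their last token.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edited = [[1.0, 0.0], [0.0, 1.0], [-1.0, 2.0]]

causal_a = attention(tokens, causal=True)
causal_b = attention(edited, causal=True)
bidir_a = attention(tokens, causal=False)
bidir_b = attention(edited, causal=False)

# Causal: the first token never sees the edit, so its output is identical.
first_token_unchanged_causal = causal_a[0] == causal_b[0]
# Bidirectional: the first token sees the whole sequence, so its output moves.
first_token_changed_bidir = bidir_a[0] != bidir_b[0]

print(first_token_unchanged_causal, first_token_changed_bidir)
```

For embedding and late-interaction (ColBERT-style) retrieval, full-sequence visibility is the property that matters: every token representation can draw on context from both sides. How the real model is patched, and what further training it requires, is not described in the post.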

by @maximelabonne (Maxime Labonne) · backlist 2026-06-18 · rubric 84.0