19.
Liquid AI’s 350M multilingual retrieval models claim 1.5ms latency
Liquid’s LFM2.5 embedding and ColBERT models target ultra-fast multilingual search across 11 languages, with end-to-end retrieval latency as low as 1.5ms.
2 appearances on the backlist front page in the last 30 days.
The bottleneck in LLM inference isn't compute. It's how fast you can move the weights. Our CTO Mathias Lechner, @mlech26l, joins Piotr Mazurek, @tugot17, from our inference team, to discuss what actually limits token throughput and how