@artemg314 on Backlist

11.

Latent Context Language Models make long-context inference up to 8.8× faster

A 0.6B-parameter encoder compresses long context into latent vectors for a 4B-parameter decoder, reducing long-context inference cost while preserving accuracy

by @artemg314 (Artem Gazizov) · backlist 2026-06-17 · rubric 72.0