@_inception_ai on Backlist

37.

The fastest reasoning LLM is now in production on Baseten. (x.com)

Mercury 2 is a diffusion LLM, so it generates tokens in parallel and hits 1,000+ tokens/sec on @NVIDIAAI GPUs, speeds that used to require specialized hardware. @augmentcode i

by @_inception_ai (Inception) · backlist 2026-06-11 · rubric 89.0