Kimi-K2.7-Code open-sourced with SGLang day-zero support (x.com)
Kimi’s open coding model improves long, multi-step coding and uses fewer reasoning tokens, while keeping the K2.5/K2.6 architecture so it deploys in existing SGLang setups.
5 appearances on the backlist front page in the last 30 days.
New record on GB300 NVL72: SGLang exceeds 12K tok/s per GPU on DeepSeek V4 Pro 1.6T (FP4, 8K/1K), orchestrated with NVIDIA Dynamo (SGLang) and MTP. Per @SemiAnalysis_ InferenceX benchmarks, performance stays strong across the entire in
Meet Gemma 4 12B Unified from @googlegemma! This is a 12B dense, encoder-free multimodal model that runs text, image & audio natively on-device. Day-0 support is now live in SGLang! Encoder-free architecture: raw image patches + audio wavefo
Congrats to the @liquidai team on LFM2.5-8B-A1B! Day-0 support is now live in SGLang.
- 8B MoE, 1.5B active
- Fast tool calling, punches 4x its size
- 128K context + better non-Latin support
- Runs local, no API keys, no data leaving
SGLang v0.5.12.post1 is live. This is a stability patch on top of v0.5.12, with 12 cherry-picks focused on DeepSeek V4, NIXL PD disaggregation, and Blackwell. DeepSeek V4: Fixed V4-Pro garbled text on single-token decode (B200/B300). Fixed