vLLM 0.21 ships KV offload, Mooncake distributed KV, and DeepSeek V4 pipeline parallelism
The release bundles 367 commits from 202 contributors and targets current bottlenecks in reasoning-model serving, Blackwell MLA kernels, and distributed KV caches.