Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

Abstract

We present Seed Diffusion Preview, a large-scale language model based on discrete-state diffusion that offers remarkably fast inference. Because they generate tokens in parallel rather than sequentially, discrete diffusion models mitigate the inherent latency of token-by-token decoding, as recent work has demonstrated (e.g., Mercury Coder, Gemini Diffusion). Seed Diffusion Preview achieves an inference speed of 2,146 tokens/s on H20 GPUs while maintaining competitive performance across a sweep of standard code evaluation benchmarks. It is significantly faster than the contemporary Mercury and Gemini Diffusion models, establishing a new state of the art on the speed-quality Pareto frontier for code models.