
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

August 4, 2025
Authors: Yuxuan Song, Zheng Zhang, Cheng Luo, Pengyang Gao, Fan Xia, Hao Luo, Zheng Li, Yuehang Yang, Hongli Yu, Xingwei Qu, Yuwei Fu, Jing Su, Ge Zhang, Wenhao Huang, Mingxuan Wang, Lin Yan, Xiaoying Jia, Jingjing Liu, Wei-Ying Ma, Ya-Qin Zhang, Yonghui Wu, Hao Zhou
cs.AI

Abstract

We present Seed Diffusion Preview, a large-scale language model based on discrete-state diffusion that offers remarkably fast inference. Because they generate tokens in parallel rather than sequentially, discrete diffusion models mitigate the inherent latency of token-by-token decoding, as recently demonstrated by systems such as Mercury Coder and Gemini Diffusion. Seed Diffusion Preview achieves an inference speed of 2,146 tokens/s on H20 GPUs while maintaining competitive performance across a sweep of standard code evaluation benchmarks. It is significantly faster than the contemporary Mercury and Gemini Diffusion models and establishes a new state of the art on the speed-quality Pareto frontier for code models.
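The latency argument in the abstract can be illustrated with a toy count of sequential forward passes. This is only a sketch under stated assumptions, not Seed Diffusion's decoding algorithm: the function names and the fixed `tokens_per_pass` value are hypothetical, and the model ignores the fact that a parallel pass may cost more than an autoregressive one.

```python
import math


def sequential_passes_autoregressive(num_tokens: int) -> int:
    """Token-by-token decoding: one forward pass per generated token."""
    return num_tokens


def sequential_passes_parallel(num_tokens: int, tokens_per_pass: int) -> int:
    """Hypothetical parallel decoder that commits a fixed number of tokens per pass.

    The fixed `tokens_per_pass` is an illustrative assumption; a real diffusion
    sampler may commit a varying number of tokens at each step.
    """
    if tokens_per_pass < 1:
        raise ValueError("tokens_per_pass must be at least 1")
    return math.ceil(num_tokens / tokens_per_pass)


if __name__ == "__main__":
    n = 512  # illustrative sequence length, not a figure from the paper
    ar = sequential_passes_autoregressive(n)
    par = sequential_passes_parallel(n, tokens_per_pass=8)
    print(f"autoregressive passes: {ar}")
    print(f"parallel passes:       {par}")
    print(f"reduction in sequential passes: {ar / par:.1f}x")
```

Under these assumptions, the number of sequential passes shrinks in proportion to how many tokens each pass commits. Actual throughput, such as the reported 2,146 tokens/s, also depends on per-pass cost and hardware.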