CoDA: Coding LM via Diffusion Adaptation

September 27, 2025
Authors: Haolin Chen, Shiyu Wang, Can Qin, Bo Pang, Zuxin Liu, Jielin Qiu, Jianguo Zhang, Yingbo Zhou, Zeyuan Chen, Ran Xu, Shelby Heinecke, Silvio Savarese, Caiming Xiong, Huan Wang, Weiran Yao
cs.AI

Abstract

Diffusion language models promise bidirectional context and infilling capabilities that autoregressive coders lack, yet practical systems remain heavyweight. We introduce CoDA, a 1.7B-parameter diffusion coder trained on TPU with a fully open-source training pipeline. CoDA pairs large-scale diffusion pre-training with code-centric mid-training and instruction tuning, enabling confidence-guided sampling that keeps inference latency competitive. On HumanEval, MBPP, and EvalPlus, CoDA-1.7B-Instruct matches or surpasses diffusion models up to 7B parameters. Our release includes model checkpoints, evaluation harnesses, and TPU training pipelines to accelerate research on lightweight diffusion-based coding assistants.
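The abstract names confidence-guided sampling without detailing it, so the following is only a minimal, generic sketch of confidence-guided unmasking as commonly used in masked diffusion language models, not CoDA's actual sampler; toy_model, MASK_ID, the vocabulary size, and the step schedule are placeholder assumptions. At each step the model scores every still-masked position, the most confident predictions are committed, and the rest stay masked to be revisited with richer bidirectional context in later steps.

import numpy as np

# Illustrative confidence-guided sampler for a masked diffusion LM.
# toy_model is a stand-in for a trained denoiser: it returns a probability
# distribution over the vocabulary at every position of the sequence.
VOCAB_SIZE = 32
MASK_ID = -1  # sentinel for a masked position (never produced by the model)
rng = np.random.default_rng(0)

def toy_model(tokens: np.ndarray) -> np.ndarray:
    """Return per-position probabilities with shape (len(tokens), VOCAB_SIZE)."""
    logits = rng.normal(size=(tokens.shape[0], VOCAB_SIZE))
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return probs / probs.sum(axis=-1, keepdims=True)

def confidence_guided_sample(length: int, num_steps: int = 4) -> np.ndarray:
    tokens = np.full(length, MASK_ID)  # start from a fully masked sequence
    for step in range(num_steps):
        masked = np.where(tokens == MASK_ID)[0]
        if masked.size == 0:
            break
        probs = toy_model(tokens)
        pred = probs[masked].argmax(axis=-1)   # greedy guess per masked slot
        conf = probs[masked].max(axis=-1)      # confidence of each guess
        # Commit only the most confident fraction this step; low-confidence
        # positions stay masked and are re-predicted in later steps.
        quota = int(np.ceil(masked.size / (num_steps - step)))
        keep = np.argsort(-conf)[:quota]
        tokens[masked[keep]] = pred[keep]
    return tokens

print(confidence_guided_sample(length=12))

Because each step commits many tokens in parallel, the number of forward passes scales with the number of refinement steps rather than the sequence length, which is the usual argument for diffusion decoding remaining latency-competitive with autoregressive generation.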