Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model

Abstract

Diffusion-based language models (DLLMs) offer non-sequential, block-wise generation and richer data reuse than autoregressive (AR) models, but existing code DLLMs still lag behind strong AR baselines under comparable budgets. We revisit this setting in a controlled study and introduce Stable-DiffCoder, a block diffusion code model that reuses the Seed-Coder architecture, data, and training pipeline. To enable efficient knowledge learning and stable training, we incorporate a block diffusion continual pretraining (CPT) stage enhanced by a tailored warmup and a block-wise clipped noise schedule. Under the same data and architecture, Stable-DiffCoder outperforms its AR counterpart overall on a broad suite of code benchmarks. Moreover, relying only on the CPT and supervised fine-tuning stages, Stable-DiffCoder achieves stronger performance than a wide range of ~8B AR models and DLLMs, demonstrating that diffusion-based training can improve code modeling quality beyond AR training alone. In addition, diffusion-based any-order modeling improves structured code modeling for editing and reasoning and, through data augmentation, benefits low-resource programming languages.


