Free Lunch for Pass@k? Low Cost Diverse Sampling for Diffusion Language Models

Abstract

Diverse outputs in text generation are necessary for effective exploration in complex reasoning tasks such as code generation and mathematical problem solving. Such Pass@k problems benefit from distinct candidates that cover the solution space, yet traditional sampling approaches often waste computational resources on repetitive failure modes. Although Diffusion Language Models have emerged as a competitive alternative to the prevailing autoregressive paradigm, they remain susceptible to this redundancy, with independent samples frequently collapsing into similar modes. To address this, we propose a training-free, low-cost intervention that enhances generative diversity in Diffusion Language Models. Our approach modifies the intermediate samples in a batch sequentially: each sample is repelled from the feature space of the previous samples, actively penalising redundancy. Unlike prior methods that require retraining or beam search, our strategy incurs negligible computational overhead while ensuring that each sample contributes a unique perspective to the batch. We evaluate our method on the HumanEval and GSM8K benchmarks using the LLaDA-8B-Instruct model. Our results demonstrate significantly improved diversity and Pass@k performance across various temperature settings. As a simple modification to the sampling process, our method offers an immediate, low-cost improvement for current and future Diffusion Language Models in tasks that benefit from diverse solution search. We make our code available at https://github.com/sean-lamont/odd.
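
To make the sequential repulsion idea concrete, the following is a minimal sketch of one plausible reading of it, not the authors' implementation (which lives in the repository linked above). It assumes that each sample's intermediate state can be summarised as a plain feature vector and that "repelled from the feature space of previous samples" means removing the component that lies along directions already occupied by earlier samples. The function name `repel_sequentially` and the `strength` parameter are hypothetical and do not come from the paper.

```python
import math


def _dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def _normalise(v, eps=1e-12):
    """Return v scaled to unit length, or None if v is (numerically) zero."""
    norm = math.sqrt(_dot(v, v))
    return [a / norm for a in v] if norm > eps else None


def repel_sequentially(features, strength=1.0):
    """Hypothetical helper: push each vector away from the span of earlier ones.

    features: list of equal-length float lists, one per sample in the batch.
    strength: assumed knob; 0.0 leaves vectors unchanged, 1.0 removes the
              full component along every previously seen direction.
    """
    basis = []     # orthonormal directions covered by earlier samples
    repelled = []
    for vec in features:
        # Subtract (a fraction of) the projection onto each earlier direction.
        out = list(vec)
        for b in basis:
            coeff = _dot(vec, b)
            out = [o - strength * coeff * bi for o, bi in zip(out, b)]
        repelled.append(out)

        # Record whatever new direction this sample contributes (Gram-Schmidt).
        residual = list(out)
        for b in basis:
            c = _dot(residual, b)
            residual = [r - c * bi for r, bi in zip(residual, b)]
        unit = _normalise(residual)
        if unit is not None:
            basis.append(unit)
    return repelled


if __name__ == "__main__":
    batch = [[1.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
    print(repel_sequentially(batch))
```

The first sample is left untouched and every later sample is modified only with respect to those before it, which matches the sequential, within-batch ordering described in the abstract. With `strength=1.0` a vector that lies entirely in the span of earlier samples collapses to zero, so a practical variant would presumably use a softer setting in a much higher-dimensional feature space; how the modified features feed back into the diffusion sampling step is outside the scope of this sketch.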
