Free Lunch for Pass@k? Low Cost Diverse Sampling for Diffusion Language Models

Abstract

Diverse outputs in text generation are necessary for effective exploration in complex reasoning tasks such as code generation and mathematical problem solving. Such Pass@k problems benefit from distinct candidates that cover the solution space. However, traditional sampling approaches often waste computational resources on repetitive failure modes. While Diffusion Language Models have emerged as a competitive alternative to the prevailing autoregressive paradigm, they remain susceptible to this redundancy, with independent samples frequently collapsing into similar modes. To address this, we propose a training-free, low-cost intervention to enhance generative diversity in Diffusion Language Models. Our approach modifies the intermediate samples in a batch sequentially: each sample is repelled from the feature space of the previous samples, actively penalising redundancy. Unlike prior methods that require retraining or beam search, our strategy incurs negligible computational overhead while ensuring that each sample contributes a unique perspective to the batch. We evaluate our method on the HumanEval and GSM8K benchmarks using the LLaDA-8B-Instruct model. Our results demonstrate significantly improved diversity and Pass@k performance across various temperature settings. As a simple modification to the sampling process, our method offers an immediate, low-cost improvement for current and future Diffusion Language Models in tasks that benefit from diverse solution search. We make our code available at https://github.com/sean-lamont/odd.
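
The sequential repulsion idea can be illustrated with a toy sketch. The abstract does not specify the feature space, the repulsion rule, or its strength, so everything below is an assumption made for illustration: the function names `repel_sequentially` and `mean_abs_cosine`, the `strength` parameter, and the use of plain NumPy vectors as stand-ins for intermediate sample features are all hypothetical. The rule used here, subtracting each sample's projection onto the samples before it, is ordinary Gram-Schmidt-style orthogonalisation swapped in as a simple stand-in; it is not taken from the released code.

```python
import numpy as np


def repel_sequentially(features, strength=1.0, eps=1e-8):
    """Push each row away from the directions of the rows before it.

    Assumed toy rule (not the authors' rule): for sample i, subtract
    `strength` times its projection onto every earlier, already
    modified sample j < i. Sample 0 is left untouched.
    """
    out = np.asarray(features, dtype=float).copy()
    for i in range(1, len(out)):
        for j in range(i):
            prev = out[j]
            norm_sq = prev @ prev
            if norm_sq < eps:
                continue  # skip near-zero directions to avoid dividing by ~0
            out[i] -= strength * (out[i] @ prev) / norm_sq * prev
    return out


def mean_abs_cosine(features, eps=1e-8):
    """Mean absolute cosine similarity over all distinct row pairs."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / (norms + eps)
    sims = unit @ unit.T
    upper = np.triu_indices(len(features), k=1)
    return float(np.abs(sims[upper]).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Four nearly identical feature vectors: a stand-in for samples
    # that have collapsed into the same mode.
    base = rng.normal(size=16)
    batch = base + 0.05 * rng.normal(size=(4, 16))
    repelled = repel_sequentially(batch, strength=1.0)
    print("redundancy before:", mean_abs_cosine(batch))
    print("redundancy after: ", mean_abs_cosine(repelled))
```

In this sketch the cost is a handful of dot products per pair of samples, which is small next to a model forward pass; that is in the spirit of the low overhead the abstract claims, but it says nothing about the actual cost of the proposed method.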
