

Free Lunch for Pass@k? Low Cost Diverse Sampling for Diffusion Language Models

March 5, 2026
Authors: Sean Lamont, Christian Walder, Paul Montague, Amir Dezfouli, Michael Norrish
cs.AI

Abstract

Diverse outputs in text generation are necessary for effective exploration in complex reasoning tasks, such as code generation and mathematical problem solving. Such Pass@k problems benefit from distinct candidates covering the solution space. However, traditional sampling approaches often waste computational resources on repetitive failure modes. While Diffusion Language Models have emerged as a competitive alternative to the prevailing Autoregressive paradigm, they remain susceptible to this redundancy, with independent samples frequently collapsing into similar modes. To address this, we propose a training-free, low-cost intervention to enhance generative diversity in Diffusion Language Models. Our approach modifies intermediate samples in a batch sequentially, where each sample is repelled from the feature space of previous samples, actively penalising redundancy. Unlike prior methods that require retraining or beam search, our strategy incurs negligible computational overhead, while ensuring that each sample contributes a unique perspective to the batch. We evaluate our method on the HumanEval and GSM8K benchmarks using the LLaDA-8B-Instruct model. Our results demonstrate significantly improved diversity and Pass@k performance across various temperature settings. As a simple modification to the sampling process, our method offers an immediate, low-cost improvement for current and future Diffusion Language Models in tasks that benefit from diverse solution search. We make our code available at https://github.com/sean-lamont/odd.
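The abstract describes the mechanism only at a high level: each sample in a batch is processed in sequence and pushed away, in feature space, from the samples generated before it. The following is a rough, hypothetical sketch of that idea only. The function name, the `embed` feature map, the dot-product similarity penalty, and the `strength` step size are all illustrative assumptions, not the paper's actual method (see the linked repository for the real implementation).

```python
import torch

def repel_batch_sequentially(x_batch, embed, strength=0.1):
    """Sketch of sequential feature-space repulsion within a batch.

    x_batch: (B, D) tensor of intermediate continuous representations.
    embed:   callable mapping a (D,) sample to a feature vector.
    Each sample takes one gradient step that reduces its summed
    similarity to the features of all previously processed samples.
    """
    repelled, prev_feats = [], []
    for x in x_batch:
        x = x.detach().clone().requires_grad_(True)
        if prev_feats:
            feats = embed(x)                    # current features
            prev = torch.stack(prev_feats)      # (i, d) earlier features
            # simple similarity penalty: summed dot products with
            # earlier samples; its gradient points toward redundancy
            sim = (feats.unsqueeze(0) * prev).sum()
            sim.backward()
            with torch.no_grad():
                x = x - strength * x.grad       # step away from the rest
        repelled.append(x.detach())
        prev_feats.append(embed(x).detach())
    return torch.stack(repelled)
```

In an actual Diffusion Language Model sampler this kind of update would be applied to the intermediate (continuous) states at each denoising step, before they are decoded back into tokens; the sketch above only shows the repulsion step in isolation.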