
Text Generation Beyond Discrete Token Sampling

May 20, 2025
Authors: Yufan Zhuang, Liyuan Liu, Chandan Singh, Jingbo Shang, Jianfeng Gao
cs.AI

Abstract

In standard autoregressive generation, an LLM predicts the next-token distribution, samples a discrete token, and then discards the distribution, passing only the sampled token as new input. To preserve this distribution's rich information, we propose Mixture of Inputs (MoI), a training-free method for autoregressive generation. After generating a token following the standard paradigm, we construct a new input that blends the generated discrete token with the previously discarded token distribution. Specifically, we employ a Bayesian estimation method that treats the token distribution as the prior, the sampled token as the observation, and replaces the conventional one-hot vector with the continuous posterior expectation as the new model input. MoI allows the model to maintain a richer internal representation throughout the generation process, resulting in improved text quality and reasoning capabilities. On mathematical reasoning, code generation, and PhD-level QA tasks, MoI consistently improves performance across multiple models including QwQ-32B, Nemotron-Super-49B, Gemma-3-27B, and DAPO-Qwen-32B, with no additional training and negligible computational overhead.
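To make the mechanism concrete, below is a minimal sketch of the idea described in the abstract, not the paper's exact estimator. It assumes the Bayesian posterior expectation can be approximated by a convex combination of the sampled one-hot vector and the predicted next-token distribution, controlled by a hypothetical concentration hyperparameter `beta`; the blended weights are then used to average the input-embedding rows instead of performing a one-hot lookup.

```python
# Hedged sketch of the Mixture of Inputs (MoI) idea from the abstract.
# Assumptions (not from the paper): the posterior expectation is modeled as
# (beta * prior + one_hot) / (beta + 1), with `beta` a made-up hyperparameter.
import torch


def moi_input_embedding(logits, sampled_id, embedding, beta=1.0):
    """Blend the sampled token with its predicted distribution (assumed form).

    logits:     (vocab_size,) next-token logits from the current step
    sampled_id: int, the token actually sampled
    embedding:  (vocab_size, hidden_dim) model input-embedding matrix
    beta:       hypothetical weight on the prior distribution
    """
    probs = torch.softmax(logits, dim=-1)      # prior: full next-token distribution
    one_hot = torch.zeros_like(probs)
    one_hot[sampled_id] = 1.0                  # observation: the sampled discrete token
    # Assumed posterior-expectation style update: mix prior and observation.
    weights = (beta * probs + one_hot) / (beta + 1.0)
    # Continuous input: weight-averaged embedding instead of a one-hot lookup.
    return weights @ embedding                 # (hidden_dim,)


# Toy usage with random tensors (not a real model).
vocab, dim = 8, 4
emb = torch.randn(vocab, dim)
logits = torch.randn(vocab)
tok = int(torch.multinomial(torch.softmax(logits, -1), 1))
print(moi_input_embedding(logits, tok, emb, beta=1.0))
```

In this reading, `beta = 0` recovers standard autoregressive decoding (pure one-hot input), while larger values of `beta` preserve more of the otherwise discarded distribution in the next-step input.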
