Text Generation Beyond Discrete Token Sampling

May 20, 2025
作者: Yufan Zhuang, Liyuan Liu, Chandan Singh, Jingbo Shang, Jianfeng Gao
cs.AI

Abstract

In standard autoregressive generation, an LLM predicts the next-token distribution, samples a discrete token, and then discards the distribution, passing only the sampled token as new input. To preserve this distribution's rich information, we propose Mixture of Inputs (MoI), a training-free method for autoregressive generation. After generating a token following the standard paradigm, we construct a new input that blends the generated discrete token with the previously discarded token distribution. Specifically, we employ a Bayesian estimation method that treats the token distribution as the prior, the sampled token as the observation, and replaces the conventional one-hot vector with the continuous posterior expectation as the new model input. MoI allows the model to maintain a richer internal representation throughout the generation process, resulting in improved text quality and reasoning capabilities. On mathematical reasoning, code generation, and PhD-level QA tasks, MoI consistently improves performance across multiple models including QwQ-32B, Nemotron-Super-49B, Gemma-3-27B, and DAPO-Qwen-32B, with no additional training and negligible computational overhead.
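Read literally, the posterior expectation described above is a convex combination of the predicted next-token distribution (the prior) and the sampled token's one-hot vector (the observation), which is then used to mix input embeddings instead of performing a single embedding lookup. Below is a minimal sketch of one decoding step under that reading; the function name `moi_input_embedding`, the hyperparameter `beta`, and the Dirichlet-multinomial form of the posterior are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of a Mixture-of-Inputs style decoding step
# (hypothetical names and hyperparameters; not the paper's code).
import torch
import torch.nn.functional as F

def moi_input_embedding(logits, embedding_matrix, temperature=1.0, beta=1.0):
    """Sample the next token, then build a continuous input embedding that
    blends the sampled token with the full next-token distribution.

    logits:           (vocab_size,) raw scores for the next token
    embedding_matrix: (vocab_size, hidden_dim) the model's input embeddings
    beta:             assumed hyperparameter weighting the prior distribution
                      in the posterior mean
    """
    # Next-token distribution (treated as the prior).
    probs = F.softmax(logits / temperature, dim=-1)

    # Standard discrete sampling (treated as the observation).
    token_id = torch.multinomial(probs, num_samples=1)
    one_hot = F.one_hot(token_id.squeeze(-1), num_classes=probs.size(-1)).float()

    # Posterior expectation under a Dirichlet-multinomial sketch:
    # a convex combination of the distribution and the sampled one-hot vector.
    posterior_mean = (beta * probs + one_hot) / (beta + 1.0)

    # Continuous input: expected embedding under the posterior, replacing the
    # usual single-row embedding lookup of the sampled token.
    next_input = posterior_mean @ embedding_matrix  # (hidden_dim,)
    return token_id, next_input
```

At the next step, `next_input` would be fed to the model in place of the standard embedding of `token_id`, so no retraining or architectural change is required, consistent with the training-free claim above.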
