

When to Ensemble: Identifying Token-Level Points for Stable and Fast LLM Ensembling

October 17, 2025
Authors: Heecheol Yun, Kwangmin Ki, Junghyun Lee, Eunho Yang
cs.AI

Abstract

Ensembling Large Language Models (LLMs) has gained attention as a promising approach to surpass the performance of individual models by leveraging their complementary strengths. In particular, aggregating models' next-token probability distributions to select the next token has been shown to be effective in various tasks. However, while successful for short-form answers, its application to long-form generation remains underexplored. In this paper, we show that using existing ensemble methods in long-form generation requires a careful choice of ensembling positions, since the standard practice of ensembling at every token often degrades performance. We identify two key factors for determining these positions: tokenization mismatch across models and consensus in their next-token probability distributions. Based on this, we propose SAFE (Stable And Fast LLM Ensembling), a framework that selectively ensembles by jointly considering these factors. To further improve stability, we introduce a probability sharpening strategy that consolidates probabilities spread across multiple sub-word tokens representing the same word into a single representative token. Our experiments on diverse benchmarks, including MATH500 and BBH, demonstrate that SAFE outperforms existing methods in both accuracy and efficiency, with gains achieved even when ensembling fewer than 1% of tokens.
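
The sketch below illustrates two of the ideas the abstract describes: gating the ensemble on model consensus, and sharpening probability mass that is split across sub-word variants of the same word. It is a minimal toy example, not the authors' SAFE implementation: it assumes the models already share an aligned vocabulary (so the tokenization-mismatch factor is not modeled), uses top-1 agreement as a stand-in for the paper's consensus criterion, and the `VOCAB` list and `groups` mapping are hypothetical.

```python
import numpy as np

# Toy shared vocabulary. "ans" and "answer" are sub-word variants of the
# same word; sharpening merges their probability mass onto "answer".
VOCAB = ["ans", "answer", "the", "is"]

def consensus(dists):
    """Toy consensus check: models 'agree' if they all rank the same
    token first (a stand-in for the paper's consensus criterion)."""
    top1 = {int(np.argmax(d)) for d in dists}
    return len(top1) == 1

def sharpen(dist, groups):
    """Consolidate probability spread over sub-word variants of one word
    onto a single representative token, then renormalize.
    `groups` maps a representative index to its (non-overlapping) variant
    indices; the mapping itself is hypothetical here."""
    out = np.asarray(dist, dtype=float).copy()
    for rep, members in groups.items():
        out[rep] += sum(out[m] for m in members)
        for m in members:
            out[m] = 0.0
    return out / out.sum()

def ensemble_step(dists, groups):
    """Selective ensembling: when the models already agree, keep the base
    model's distribution (cheaper, and ensembling adds little); otherwise
    sharpen each distribution and average them."""
    if consensus(dists):
        return np.asarray(dists[0], dtype=float)
    sharpened = [sharpen(d, groups) for d in dists]
    return np.mean(sharpened, axis=0)

if __name__ == "__main__":
    groups = {1: [0]}                    # "ans" (0) folds into "answer" (1)
    model_a = [0.30, 0.35, 0.20, 0.15]   # mass split across "ans"/"answer"
    model_b = [0.10, 0.25, 0.45, 0.20]   # top-1 is "the" before sharpening
    merged = ensemble_step([model_a, model_b], groups)
    print(VOCAB[int(np.argmax(merged))], merged.round(3))
    # -> answer [0.    0.5   0.325 0.175]
```

In this toy run the two models disagree on the top token, so the step sharpens and averages: after merging the "ans"/"answer" mass, "answer" wins with probability 0.5, whereas a naive average of the raw distributions would have favored "the". This mirrors the stability argument in the abstract, where probability spread over sub-word tokens can mask the models' true preference.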