Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation
August 16, 2025
Authors: Jinyi Han, Tingyun Li, Shisong Chen, Jie Shi, Xinyi Wang, Guanglei Yue, Jiaqing Liang, Xin Lin, Liqian Wen, Zulong Chen, Yanghua Xiao
cs.AI
Abstract
While large language models (LLMs) have demonstrated remarkable performance
across diverse tasks, they fundamentally lack self-awareness and frequently
exhibit overconfidence, assigning high confidence scores to incorrect
predictions. Accurate confidence estimation is therefore critical for enhancing
the trustworthiness and reliability of LLM-generated outputs. However, existing
approaches suffer from coarse-grained scoring mechanisms that fail to provide
fine-grained, continuous confidence estimates throughout the generation
process. To address these limitations, we introduce FineCE, a novel confidence
estimation method that delivers accurate, fine-grained confidence scores during
text generation. Specifically, we first develop a comprehensive pipeline for
constructing training data that effectively captures the underlying
probabilistic distribution of LLM responses, and then train a model to predict
confidence scores for arbitrary text sequences in a supervised manner.
Furthermore, we propose a Backward Confidence Integration (BCI) strategy that
leverages information from the subsequent text to enhance confidence estimation
for the current sequence during inference. We also introduce three strategies
for identifying optimal positions to perform confidence estimation within the
generation process. Extensive experiments on multiple benchmark datasets
demonstrate that FineCE consistently outperforms existing classical confidence
estimation methods. Our code and all baselines used in the paper are available
on GitHub.
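The Backward Confidence Integration (BCI) idea described above can be sketched as follows. This is a minimal illustration only: it assumes BCI can be approximated as a decay-weighted average that pulls each segment's confidence toward the scores of subsequent segments, which is an assumption, not the paper's actual formulation. The function name `backward_confidence_integration` and the `decay` parameter are hypothetical.

```python
def backward_confidence_integration(forward_scores, decay=0.5):
    """Refine per-segment confidence scores using later segments.

    Hypothetical BCI-style sketch: the refined confidence of segment i
    is a decay-weighted average of its own forward-pass score and the
    forward scores of all subsequent segments, so evidence from later
    text flows backward into earlier estimates.
    """
    n = len(forward_scores)
    refined = []
    for i in range(n):
        # Weight segment j >= i by decay^(j - i): later segments
        # contribute less the farther they are from segment i.
        weights = [decay ** (j - i) for j in range(i, n)]
        total = sum(w * forward_scores[j] for w, j in zip(weights, range(i, n)))
        refined.append(total / sum(weights))
    return refined


# Example: an initially overconfident first segment is pulled down
# once later segments turn out to have low confidence.
scores = backward_confidence_integration([0.9, 0.5, 0.1], decay=0.5)
```

In this toy setting, a high early confidence (0.9) is revised downward because the following segments score poorly, while the final segment, having no successors, keeps its forward score unchanged.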