Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation
August 16, 2025
Authors: Jinyi Han, Tingyun Li, Shisong Chen, Jie Shi, Xinyi Wang, Guanglei Yue, Jiaqing Liang, Xin Lin, Liqian Wen, Zulong Chen, Yanghua Xiao
cs.AI
Abstract
While large language models (LLMs) have demonstrated remarkable performance
across diverse tasks, they fundamentally lack self-awareness and frequently
exhibit overconfidence, assigning high confidence scores to incorrect
predictions. Accurate confidence estimation is therefore critical for enhancing
the trustworthiness and reliability of LLM-generated outputs. However, existing
approaches suffer from coarse-grained scoring mechanisms that fail to provide
fine-grained, continuous confidence estimates throughout the generation
process. To address these limitations, we introduce FineCE, a novel confidence
estimation method that delivers accurate, fine-grained confidence scores during
text generation. Specifically, we first develop a comprehensive pipeline for
constructing training data that effectively captures the underlying
probabilistic distribution of LLM responses, and then train a model to predict
confidence scores for arbitrary text sequences in a supervised manner.
Furthermore, we propose a Backward Confidence Integration (BCI) strategy that
leverages information from the subsequent text to enhance confidence estimation
for the current sequence during inference. We also introduce three strategies
for identifying optimal positions to perform confidence estimation within the
generation process. Extensive experiments on multiple benchmark datasets
demonstrate that FineCE consistently outperforms existing classical confidence
estimation methods. Our code and all baselines used in the paper are available
on GitHub.
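To make the abstract's pipeline concrete, below is a minimal, hypothetical sketch of the two ideas it describes: scoring a partial generation at chosen positions, and a Backward Confidence Integration (BCI)-style pass that revises earlier scores using later ones. The scorer here is a toy heuristic stand-in, not the supervised model the paper trains, and `backward_confidence_integration` is an assumed backward running average, not the authors' actual formulation.

```python
# Hypothetical sketch of fine-grained confidence estimation during generation.
# The scorer below is a toy heuristic; FineCE instead trains a model in a
# supervised way to predict confidence for arbitrary text sequences.

from typing import Callable, List


def estimate_confidence(prefix: str) -> float:
    """Toy stand-in for a trained confidence estimator: maps any text
    prefix to a score in [0, 1]."""
    # Heuristic: more context words -> slightly higher confidence.
    return min(1.0, 0.5 + 0.01 * len(prefix.split()))


def backward_confidence_integration(
    scores: List[float], alpha: float = 0.5
) -> List[float]:
    """Assumed BCI-like pass: revise each position's confidence using
    information from subsequent positions (a backward running average)."""
    revised = scores[:]
    for i in range(len(scores) - 2, -1, -1):
        revised[i] = alpha * scores[i] + (1 - alpha) * revised[i + 1]
    return revised


def fine_grained_confidence(
    tokens: List[str],
    positions: List[int],
    scorer: Callable[[str], float] = estimate_confidence,
) -> List[float]:
    """Score the partial sequence at each chosen estimation position."""
    return [scorer(" ".join(tokens[: p + 1])) for p in positions]


tokens = "The capital of France is Paris".split()
positions = [1, 3, 5]  # e.g. fixed-interval estimation points
raw = fine_grained_confidence(tokens, positions)
final = backward_confidence_integration(raw)
```

The position list stands in for the paper's three position-selection strategies; any of them would simply produce a different `positions` input to the same scoring loop.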