

Confidence Estimation for LLMs in Multi-turn Interactions

January 5, 2026
作者: Caiqi Zhang, Ruihan Yang, Xiaochen Zhu, Chengzu Li, Tiancheng Hu, Yijiang River Dong, Deqing Yang, Nigel Collier
cs.AI

Abstract

While confidence estimation is a promising direction for mitigating hallucinations in Large Language Models (LLMs), current research predominantly focuses on single-turn settings. The dynamics of model confidence in multi-turn conversations, where context accumulates and ambiguity is progressively resolved, remain largely unexplored. Reliable confidence estimation in multi-turn settings is critical for many downstream applications, such as autonomous agents and human-in-the-loop systems. This work presents the first systematic study of confidence estimation in multi-turn interactions, establishing a formal evaluation framework grounded in two key desiderata: per-turn calibration and monotonicity of confidence as more information becomes available. To facilitate this, we introduce novel metrics, including a length-normalized Expected Calibration Error (InfoECE), and a new "Hinter-Guesser" paradigm for generating controlled evaluation datasets. Our experiments reveal that widely used confidence estimation techniques struggle with calibration and monotonicity in multi-turn dialogues. We propose P(Sufficient), a logit-based probe that achieves comparatively better performance, although the task remains far from solved. Our work provides a foundational methodology for developing more reliable and trustworthy conversational agents.
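To make the two desiderata concrete, below is a minimal Python sketch of how they might be measured. The binned Expected Calibration Error formula is standard; the per-turn grouping, the `per_turn_ece` helper, and the `monotonicity_violation_rate` function are illustrative assumptions, since the abstract does not specify the exact length normalization used by InfoECE or the paper's precise monotonicity metric.

```python
import numpy as np

def expected_calibration_error(confidences, correctness, n_bins=10):
    """Standard binned ECE: bucket predictions by confidence and take the
    sample-weighted average gap between mean confidence and accuracy."""
    confidences = np.asarray(confidences, dtype=float)
    correctness = np.asarray(correctness, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correctness[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return ece

def per_turn_ece(records, n_bins=10):
    """Per-turn calibration (assumed grouping, not the paper's InfoECE):
    records is a list of (turn_index, confidence, is_correct) tuples,
    and ECE is computed separately for each dialogue turn."""
    by_turn = {}
    for turn, conf, correct in records:
        by_turn.setdefault(turn, []).append((conf, correct))
    return {
        turn: expected_calibration_error(*zip(*vals), n_bins=n_bins)
        for turn, vals in sorted(by_turn.items())
    }

def monotonicity_violation_rate(confidence_trajectories):
    """Assumed monotonicity metric: fraction of adjacent turn pairs where
    confidence drops even though more disambiguating context has arrived
    (lower is better). Each trajectory lists one confidence per turn."""
    violations = total = 0
    for traj in confidence_trajectories:
        for prev, curr in zip(traj, traj[1:]):
            total += 1
            violations += curr < prev
    return violations / total if total else 0.0
```

Under these assumptions, a well-behaved multi-turn confidence estimator would show low `per_turn_ece` values at every turn and a `monotonicity_violation_rate` near zero as ambiguity is progressively resolved.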