Explaining Sources of Uncertainty in Automated Fact-Checking
May 23, 2025
Authors: Jingyi Sun, Greta Warren, Irina Shklovski, Isabelle Augenstein
cs.AI
Abstract
Understanding sources of a model's uncertainty regarding its predictions is
crucial for effective human-AI collaboration. Prior work proposes using
numerical uncertainty or hedges ("I'm not sure, but ..."), which do not explain
uncertainty that arises from conflicting evidence, leaving users unable to
resolve disagreements or rely on the output. We introduce CLUE
(Conflict-and-Agreement-aware Language-model Uncertainty Explanations), the
first framework to generate natural language explanations of model uncertainty
by (i) identifying, in an unsupervised way, relationships between spans of
text that expose the claim-evidence or inter-evidence conflicts and agreements
driving the model's predictive uncertainty, and (ii) generating explanations
via prompting and attention steering that verbalize these critical
interactions. Across three language models and two fact-checking datasets, we
show that CLUE produces explanations that are more faithful to the model's
uncertainty and more consistent with fact-checking decisions than prompting for
uncertainty explanations without span-interaction guidance. Human evaluators
judge our explanations to be more helpful, more informative, less redundant,
and more logically consistent with the input than this baseline. CLUE requires
no fine-tuning or architectural changes, making it plug-and-play for any
white-box language model. By explicitly linking uncertainty to evidence
conflicts, it offers practical support for fact-checking and generalises
readily to other tasks that require reasoning over complex information.
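To make the two-stage pipeline described in the abstract concrete, here is a minimal sketch under several explicit assumptions: gpt2 stands in for the paper's (larger) white-box models, mean evidence-to-claim attention mass stands in for CLUE's unsupervised span-interaction measure, and the selected span is carried in the prompt rather than via the paper's attention-steering step. This is an illustration, not the authors' implementation.

```python
# Hypothetical sketch of CLUE's two stages; NOT the authors' released code.
# Assumptions (illustrative only): gpt2 as the white-box LM, mean
# evidence-to-claim attention mass as the span-interaction score, and a plain
# prompt in place of the paper's attention-steering step.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", attn_implementation="eager")
model.eval()

def span_interaction_scores(claim: str, evidence_sents: list[str]) -> list[float]:
    """Unsupervised proxy for stage (i): score each evidence sentence by how
    much attention its tokens pay to the claim's tokens."""
    ids = [tok(p, add_special_tokens=False)["input_ids"]
           for p in [claim] + [" " + s for s in evidence_sents]]
    input_ids = torch.tensor([sum(ids, [])])
    with torch.no_grad():
        attn = model(input_ids, output_attentions=True).attentions
    # Average over layers and heads -> one (seq, seq) token-to-token matrix.
    A = torch.stack(attn).mean(dim=(0, 2))[0]
    claim_len, start, scores = len(ids[0]), len(ids[0]), []
    for seq in ids[1:]:
        # Under the causal mask, evidence tokens (which come later) can
        # attend back to the claim, so only this direction is informative.
        scores.append(A[start:start + len(seq), :claim_len].mean().item())
        start += len(seq)
    return scores

def explain_uncertainty(claim: str, evidence_sents: list[str]) -> str:
    """Stage (ii), approximated: verbalize the strongest claim-evidence
    interaction through the prompt instead of steering attention."""
    scores = span_interaction_scores(claim, evidence_sents)
    top = evidence_sents[max(range(len(scores)), key=scores.__getitem__)]
    prompt = (f"Claim: {claim}\nEvidence: {' '.join(evidence_sents)}\n"
              f"The verdict is uncertain. Explain whether the span "
              f"\"{top}\" agrees or conflicts with the claim and why "
              f"that makes the verdict uncertain.\nExplanation:")
    enc = tok(prompt, return_tensors="pt")
    out = model.generate(**enc, max_new_tokens=100,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0][enc["input_ids"].shape[1]:],
                      skip_special_tokens=True)

print(explain_uncertainty(
    "The Eiffel Tower is 330 metres tall.",
    ["Official sources list the tower at 330 m including antennas.",
     "A 2010 guidebook gives the height as 324 m."]))
```

Note the design constraint the sketch inherits from causal attention: because the claim precedes the evidence in the input, only evidence-to-claim attention is non-zero, so the interaction score is one-directional here; a bidirectional measure would require a different input arrangement or model.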