Explaining Sources of Uncertainty in Automated Fact-Checking

Abstract

Understanding the sources of a model's uncertainty about its predictions is crucial for effective human-AI collaboration. Prior work proposes using numerical uncertainty or hedges ("I'm not sure, but ..."), which do not explain uncertainty that arises from conflicting evidence, leaving users unable to resolve disagreements or rely on the output. We introduce CLUE (Conflict-and-Agreement-aware Language-model Uncertainty Explanations), the first framework to generate natural language explanations of model uncertainty by (i) identifying, in an unsupervised way, relationships between spans of text that expose the claim-evidence or inter-evidence conflicts and agreements driving the model's predictive uncertainty, and (ii) generating explanations, via prompting and attention steering, that verbalize these critical interactions. Across three language models and two fact-checking datasets, we show that CLUE produces explanations that are more faithful to the model's uncertainty and more consistent with fact-checking decisions than prompting for uncertainty explanations without span-interaction guidance. Human evaluators judge our explanations to be more helpful, more informative, less redundant, and more logically consistent with the input than this baseline. CLUE requires no fine-tuning or architectural changes, making it plug-and-play for any white-box language model. By explicitly linking uncertainty to evidence conflicts, it offers practical support for fact-checking and generalises readily to other tasks that require reasoning over complex information.
