To Believe or Not to Believe Your LLM

Abstract

We explore uncertainty quantification in large language models (LLMs), with the goal of identifying when the uncertainty in responses to a given query is large. We simultaneously consider both epistemic and aleatoric uncertainty: the former comes from a lack of knowledge about the ground truth (such as facts or the language), and the latter comes from irreducible randomness (such as multiple possible answers). In particular, we derive an information-theoretic metric that reliably detects when only epistemic uncertainty is large, in which case the output of the model is unreliable. This condition can be computed based solely on the output of the model, obtained by a special iterative prompting procedure based on the previous responses. Such quantification, for instance, makes it possible to detect hallucinations (cases when epistemic uncertainty is high) in both single- and multi-answer responses. This is in contrast to many standard uncertainty quantification strategies (such as thresholding the log-likelihood of a response), which cannot detect hallucinations in the multi-answer case. We conduct a series of experiments that demonstrate the advantage of our formulation. Further, our investigations shed some light on how the probabilities an LLM assigns to a given output can be amplified by iterative prompting, which might be of independent interest.
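The abstract does not spell out the metric, so the following is only a minimal sketch under stated assumptions, not the authors' definition. It assumes that (a) correct responses to the same query are independent of one another, so that a second response should not change when the first response is placed in the prompt, and (b) dependence between the two can be measured by a plug-in estimate of mutual information. The function name `mutual_information` and the toy response pairs are hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(Y1; Y2), in nats, from (first, second) response pairs.

    Assumption for illustration: each pair holds a first response and a second
    response sampled after the first was appended to the prompt.
    """
    n = len(pairs)
    joint = Counter(pairs)
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = count * n / (first[a] * second[b])
        mi += p_ab * math.log(ratio)
    return mi


# Multi-answer query, second response ignores the first:
# high entropy (aleatoric uncertainty) but no dependence.
independent_pairs = [("A", "A"), ("A", "B"), ("B", "A"), ("B", "B")] * 25

# Second response simply repeats whatever appeared in the prompt:
# strong dependence, which this sketch treats as a sign of epistemic uncertainty.
dependent_pairs = [("A", "A"), ("B", "B")] * 50

mi_independent = mutual_information(independent_pairs)
mi_dependent = mutual_information(dependent_pairs)
```

In this toy setting both sets of pairs have the same marginal distribution over answers, so a score based only on the likelihood of a single response cannot tell them apart, while the dependence score is zero for the first set and log 2 for the second.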
