To Believe or Not to Believe Your LLM

June 4, 2024
Authors: Yasin Abbasi Yadkori, Ilja Kuzborskij, András György, Csaba Szepesvári
cs.AI

Abstract

We explore uncertainty quantification in large language models (LLMs), with the goal of identifying when the uncertainty in responses to a given query is large. We simultaneously consider both epistemic and aleatoric uncertainties, where the former comes from a lack of knowledge about the ground truth (such as about facts or the language), and the latter comes from irreducible randomness (such as multiple possible answers). In particular, we derive an information-theoretic metric that allows us to reliably detect when only epistemic uncertainty is large, in which case the output of the model is unreliable. This condition can be computed based solely on the output of the model, obtained simply by a special iterative prompting procedure based on the previous responses. Such quantification, for instance, allows us to detect hallucinations (cases when epistemic uncertainty is high) in both single- and multi-answer responses. This is in contrast to many standard uncertainty quantification strategies (such as thresholding the log-likelihood of a response), where hallucinations in the multi-answer case cannot be detected. We conduct a series of experiments which demonstrate the advantage of our formulation. Further, our investigations shed some light on how the probabilities assigned to a given output by an LLM can be amplified by iterative prompting, which might be of independent interest.
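
The abstract only sketches the procedure at a high level: sample answers, re-prompt the model with its own previous responses, and compute an information-theoretic score that is large only when epistemic uncertainty is large. The snippet below is an illustrative approximation of that pipeline under stated assumptions, not the paper's exact metric: `query_llm` is a hypothetical sampling wrapper you would back with an actual LLM API, and the KL-divergence proxy merely stands in for the mutual-information-based quantity derived in the paper.

```python
# Illustrative sketch: if appending the model's own previous answers to the
# prompt strongly shifts (amplifies) its answer distribution, epistemic
# uncertainty is likely high and the response may be a hallucination.
import math
from collections import Counter


def query_llm(prompt: str, n_samples: int) -> list[str]:
    """Hypothetical wrapper returning n sampled answers for a prompt."""
    raise NotImplementedError  # plug in an actual LLM sampling call here


def answer_distribution(samples: list[str]) -> dict[str, float]:
    """Empirical distribution over distinct answers."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {answer: c / total for answer, c in counts.items()}


def epistemic_score(query: str, n_samples: int = 20, n_rounds: int = 3) -> float:
    """Rough proxy for epistemic uncertainty via iterative prompting.

    Compare the answer distribution for the plain query with the distribution
    after the model's previous answers are fed back into the prompt.  A large
    divergence suggests the model's outputs are being amplified by its own
    responses rather than anchored by knowledge of the ground truth.
    """
    base = answer_distribution(query_llm(query, n_samples))
    prompt = query
    shifted = base
    for _ in range(n_rounds):
        # Iterative prompting: append a previously sampled answer and re-query.
        previous = max(shifted, key=shifted.get)
        prompt += f"\nOne possible answer is: {previous}. Answer the question again."
        shifted = answer_distribution(query_llm(prompt, n_samples))
    # KL(shifted || base) as a crude information-theoretic score; the paper
    # derives a more careful mutual-information-based lower bound instead.
    eps = 1e-9
    return sum(p * math.log(p / (base.get(a, 0.0) + eps)) for a, p in shifted.items())
```

In use, one would threshold `epistemic_score` to flag unreliable responses; unlike thresholding the log-likelihood of a single response, a distribution-level score of this kind can remain small for legitimately multi-answer queries.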
