
Query-Level Uncertainty in Large Language Models

June 11, 2025
Authors: Lihu Chen, Gaël Varoquaux
cs.AI

Abstract

It is important for Large Language Models to be aware of the boundaries of their knowledge, that is, to have a mechanism for identifying known and unknown queries. Such awareness helps models perform adaptive inference, such as invoking retrieval-augmented generation (RAG), engaging in slow and deep thinking, or abstaining from answering, and is essential for building efficient and trustworthy AI. In this work, we propose a method for detecting knowledge boundaries via Query-Level Uncertainty, which aims to determine whether the model can address a given query before generating any tokens. To this end, we introduce a novel, training-free method called Internal Confidence, which leverages self-evaluations across layers and tokens. Empirical results on both factual QA and mathematical reasoning tasks demonstrate that Internal Confidence outperforms several baselines. Furthermore, we show that the proposed method can be used for efficient RAG and model cascading, reducing inference costs while maintaining performance.