HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs

March 3, 2025
Authors: Tin Nguyen, Logan Bolton, Mohammad Reza Taesiri, Anh Totti Nguyen
cs.AI

Abstract

An Achilles heel of Large Language Models (LLMs) is their tendency to hallucinate non-factual statements. A response that mixes factual and non-factual statements is hard for humans to verify and to base accurate decisions on. To combat this problem, we propose Highlighted Chain-of-Thought Prompting (HoT), a technique for prompting LLMs to generate responses with XML tags that ground facts to those provided in the query. That is, given an input question, the LLM first reformats the question, adding XML tags that highlight key facts, and then generates a response with highlights over the facts referenced from the input. Interestingly, in few-shot settings, HoT outperforms vanilla chain-of-thought prompting (CoT) on a wide range of 17 tasks, from arithmetic and reading comprehension to logical reasoning. When humans are asked to verify LLM responses, highlights help time-limited participants recognize more accurately and efficiently when LLMs are correct. Yet, surprisingly, when LLMs are wrong, HoT tends to make users believe that the answer is correct.
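
To make the tagging idea concrete, here is a minimal sketch of how a tagged question and a tagged response could be cross-checked. The abstract only says that XML tags are used; the `<fact1>...</fact1>` naming, the helper functions `extract_facts` and `ungrounded_tags`, and the example question are all assumptions made for illustration, not the authors' implementation.

```python
import re

# Assumed tag format (not specified in the abstract): <fact1>...</fact1>, <fact2>...</fact2>, ...
TAG_RE = re.compile(r"<(fact\d+)>(.*?)</\1>", re.DOTALL)


def extract_facts(text):
    """Return a dict mapping each tag name to the list of text spans it wraps."""
    facts = {}
    for name, span in TAG_RE.findall(text):
        facts.setdefault(name, []).append(span.strip())
    return facts


def ungrounded_tags(question, answer):
    """Return tag names used in the answer that never appear in the tagged question."""
    question_facts = extract_facts(question)
    answer_facts = extract_facts(answer)
    return sorted(set(answer_facts) - set(question_facts))


# Hypothetical example of a reformatted question and a highlighted response.
question = (
    "Tom has <fact1>3 apples</fact1> and buys <fact2>2 more</fact2>. "
    "How many apples does he have?"
)
answer = (
    "Tom starts with <fact1>3 apples</fact1> and adds <fact2>2 more</fact2>, "
    "so he has 3 + 2 = 5 apples."
)

print(extract_facts(question))
print(ungrounded_tags(question, answer))
```

A check like this only confirms that each highlight in the response points back to a tagged span in the question. It says nothing about whether the reasoning built on those facts is correct, which matches the abstract's caution that highlights can make wrong answers look trustworthy.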