Chain-of-Thought Reasoning Without Prompting
February 15, 2024
Authors: Xuezhi Wang, Denny Zhou
cs.AI
Abstract
In enhancing the reasoning capabilities of large language models (LLMs),
prior research primarily focuses on specific prompting techniques such as
few-shot or zero-shot chain-of-thought (CoT) prompting. These methods, while
effective, often involve manually intensive prompt engineering. Our study takes
a novel approach by asking: Can LLMs reason effectively without prompting? Our
findings reveal that, intriguingly, CoT reasoning paths can be elicited from
pre-trained LLMs by simply altering the decoding process. Rather than
conventional greedy decoding, we investigate the top-k alternative tokens,
uncovering that CoT paths are frequently inherent in these sequences. This
approach not only bypasses the confounders of prompting but also allows us to
assess the LLMs' intrinsic reasoning abilities. Moreover, we observe
that the presence of a CoT in the decoding path correlates with a higher
confidence in the model's decoded answer. This confidence metric effectively
differentiates between CoT and non-CoT paths. Extensive empirical studies on
various reasoning benchmarks show that the proposed CoT-decoding substantially
outperforms the standard greedy decoding.
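The abstract describes the mechanics at a high level: instead of committing to the greedy first token, branch on the top-k alternative first tokens, continue decoding each branch, and prefer the path whose answer the model decodes with the highest confidence. Below is a minimal sketch of that procedure using the Hugging Face `transformers` API. The model checkpoint, the value of k, and the confidence measure (here, the average probability margin between the top two tokens over the whole continuation, rather than over just the answer span) are illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch of CoT-decoding as described in the abstract.
# Checkpoint name, k, and the confidence measure are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any causal LM checkpoint can stand in here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def cot_decode(prompt: str, k: int = 10, max_new_tokens: int = 64):
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_logits = model(**inputs).logits[0, -1]      # logits for the first new token
    branch_tokens = torch.topk(next_logits, k).indices   # top-k alternatives, not just greedy

    candidates = []
    for token in branch_tokens:
        # Force a different first token, then continue greedily from it.
        ids = torch.cat([inputs.input_ids[0], token.view(1)]).unsqueeze(0)
        out = model.generate(
            ids,
            do_sample=False,
            max_new_tokens=max_new_tokens,
            output_scores=True,
            return_dict_in_generate=True,
            pad_token_id=tokenizer.eos_token_id,
        )
        # Confidence: average probability margin between the top-1 and top-2
        # tokens at each decoding step. (Averaging over the whole continuation
        # instead of only the answer tokens is a simplification.)
        margins = []
        for step_scores in out.scores:
            top2 = torch.topk(torch.softmax(step_scores[0], dim=-1), 2).values
            margins.append((top2[0] - top2[1]).item())
        confidence = sum(margins) / len(margins)
        text = tokenizer.decode(out.sequences[0, inputs.input_ids.shape[1]:],
                                skip_special_tokens=True)
        candidates.append((confidence, text))

    # Per the abstract, the highest-confidence path tends to contain a CoT.
    return max(candidates)

confidence, path = cot_decode("Q: I have 3 apples, my dad has 2 more apples than me. "
                              "How many apples do we have in total?\nA:")
print(f"confidence={confidence:.3f}\n{path}")
```

Branching only at the first token keeps the search cheap (k forward passes instead of a full tree search) while still escaping the greedy path, which is the core observation the abstract attributes to CoT-decoding.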