Chain-of-Thought Reasoning Without Prompting
February 15, 2024
Authors: Xuezhi Wang, Denny Zhou
cs.AI
Abstract
In enhancing the reasoning capabilities of large language models (LLMs), prior research primarily focuses on specific prompting techniques such as few-shot or zero-shot chain-of-thought (CoT) prompting. These methods, while effective, often involve manually intensive prompt engineering. Our study takes a novel approach by asking: Can LLMs reason effectively without prompting? Our findings reveal that, intriguingly, CoT reasoning paths can be elicited from pre-trained LLMs by simply altering the decoding process. Rather than conventional greedy decoding, we investigate the top-k alternative tokens, uncovering that CoT paths are frequently inherent in these sequences. This approach not only bypasses the confounders of prompting but also allows us to assess the LLMs' intrinsic reasoning abilities. Moreover, we observe that the presence of a CoT in the decoding path correlates with a higher confidence in the model's decoded answer. This confidence metric effectively differentiates between CoT and non-CoT paths. Extensive empirical studies on various reasoning benchmarks show that the proposed CoT-decoding substantially outperforms the standard greedy decoding.
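Read operationally, the abstract suggests a simple procedure: instead of committing to the single greedy first token, branch on the top-k alternatives, continue each branch greedily, and rank the resulting paths by the model's confidence in what it decodes. The sketch below illustrates that idea under stated assumptions: it uses the Hugging Face transformers API, a placeholder model name ("gpt2"), and averages the top-1/top-2 probability margin over all generated tokens, whereas the paper computes this margin over the answer tokens specifically. It is a minimal illustration, not the authors' released implementation.

```python
# Minimal sketch of CoT-decoding as described in the abstract.
# Assumptions: Hugging Face transformers, a placeholder model, and a
# simplified confidence measure (margin averaged over all generated tokens).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; swap in any causal LM checkpoint
K = 10               # number of top-k alternative first tokens to branch on

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def cot_decode(question: str, k: int = K, max_new_tokens: int = 128):
    """Branch on the top-k first tokens, continue greedily, and score each
    path by the model's top-1 vs top-2 probability margin."""
    inputs = tokenizer(question, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]      # next-token logits
    top_k_tokens = torch.topk(logits, k).indices    # k alternative starts

    paths = []
    for first_token in top_k_tokens:
        ids = torch.cat([inputs["input_ids"][0], first_token.view(1)])
        margins = []
        for _ in range(max_new_tokens):
            # Re-running the full forward pass each step is inefficient
            # but keeps the sketch short; a real version would use a KV cache.
            with torch.no_grad():
                step_logits = model(ids.unsqueeze(0)).logits[0, -1]
            probs = torch.softmax(step_logits, dim=-1)
            top2 = torch.topk(probs, 2).values
            margins.append((top2[0] - top2[1]).item())  # confidence margin
            next_id = probs.argmax()
            ids = torch.cat([ids, next_id.view(1)])
            if next_id.item() == tokenizer.eos_token_id:
                break
        text = tokenizer.decode(ids[prompt_len:])
        # The paper restricts the margin to the answer tokens; averaging
        # over all generated tokens here is a simplifying assumption.
        paths.append((sum(margins) / len(margins), text))

    return max(paths)  # (confidence, text) of the highest-confidence path
```

Greedy decoding corresponds to keeping only the top-1 branch; the abstract's claim is that CoT paths hidden among the other top-k branches tend to carry higher answer confidence, which is what the final selection by margin exploits.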