Chain-of-Thought Reasoning Without Prompting
February 15, 2024
Authors: Xuezhi Wang, Denny Zhou
cs.AI
Abstract
In enhancing the reasoning capabilities of large language models (LLMs), prior research primarily focuses on specific prompting techniques such as few-shot or zero-shot chain-of-thought (CoT) prompting. These methods, while effective, often involve manually intensive prompt engineering. Our study takes a novel approach by asking: Can LLMs reason effectively without prompting? Our findings reveal that, intriguingly, CoT reasoning paths can be elicited from pre-trained LLMs by simply altering the decoding process. Rather than conventional greedy decoding, we investigate the top-k alternative tokens, uncovering that CoT paths are frequently inherent in these sequences. This approach not only bypasses the confounders of prompting but also allows us to assess the LLMs' intrinsic reasoning abilities. Moreover, we observe that the presence of a CoT in the decoding path correlates with a higher confidence in the model's decoded answer. This confidence metric effectively differentiates between CoT and non-CoT paths. Extensive empirical studies on various reasoning benchmarks show that the proposed CoT-decoding substantially outperforms the standard greedy decoding.
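Read operationally, the abstract suggests a simple procedure: instead of committing to the single greedy first token, branch on the top-k alternatives, continue each branch greedily, and rank the resulting paths by the model's confidence in what it decodes. The sketch below illustrates that idea under stated assumptions: it uses the Hugging Face transformers API, a placeholder model name ("gpt2"), and averages the top-1/top-2 probability margin over all generated tokens, whereas the paper computes this margin over the answer tokens specifically. It is a minimal illustration, not the authors' released implementation.

```python
# Minimal sketch of CoT-decoding as described in the abstract.
# Assumptions: Hugging Face transformers, a placeholder model, and a
# simplified confidence measure (margin averaged over all generated tokens).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; swap in any causal LM checkpoint
K = 10               # number of top-k alternative first tokens to branch on

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def cot_decode(question: str, k: int = K, max_new_tokens: int = 128):
    """Branch on the top-k first tokens, continue greedily, and score each
    path by the model's top-1 vs top-2 probability margin."""
    inputs = tokenizer(question, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]      # next-token logits
    top_k_tokens = torch.topk(logits, k).indices    # k alternative starts

    paths = []
    for first_token in top_k_tokens:
        ids = torch.cat([inputs["input_ids"][0], first_token.view(1)])
        margins = []
        for _ in range(max_new_tokens):
            # Re-running the full forward pass each step is inefficient
            # but keeps the sketch short; a real version would use a KV cache.
            with torch.no_grad():
                step_logits = model(ids.unsqueeze(0)).logits[0, -1]
            probs = torch.softmax(step_logits, dim=-1)
            top2 = torch.topk(probs, 2).values
            margins.append((top2[0] - top2[1]).item())  # confidence margin
            next_id = probs.argmax()
            ids = torch.cat([ids, next_id.view(1)])
            if next_id.item() == tokenizer.eos_token_id:
                break
        text = tokenizer.decode(ids[prompt_len:])
        # The paper restricts the margin to the answer tokens; averaging
        # over all generated tokens here is a simplifying assumption.
        paths.append((sum(margins) / len(margins), text))

    return max(paths)  # (confidence, text) of the highest-confidence path
```

Greedy decoding corresponds to keeping only the top-1 branch; the abstract's claim is that CoT paths hidden among the other top-k branches tend to carry higher answer confidence, which is what the final selection by margin exploits.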