

Self-Discover: Large Language Models Self-Compose Reasoning Structures

February 6, 2024
作者: Pei Zhou, Jay Pujara, Xiang Ren, Xinyun Chen, Heng-Tze Cheng, Quoc V. Le, Ed H. Chi, Denny Zhou, Swaroop Mishra, Huaixiu Steven Zheng
cs.AI

Abstract

We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. Core to the framework is a self-discovery process where LLMs select multiple atomic reasoning modules such as critical thinking and step-by-step thinking, and compose them into an explicit reasoning structure for LLMs to follow during decoding. SELF-DISCOVER substantially improves GPT-4 and PaLM 2's performance on challenging reasoning benchmarks such as BigBench-Hard, grounded agent reasoning, and MATH, by as much as 32% compared to Chain of Thought (CoT). Furthermore, SELF-DISCOVER outperforms inference-intensive methods such as CoT-Self-Consistency by more than 20%, while requiring 10-40x fewer inference compute. Finally, we show that the self-discovered reasoning structures are universally applicable across model families: from PaLM 2-L to GPT-4, and from GPT-4 to Llama2, and share commonalities with human reasoning patterns.
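The two-stage process the abstract describes — first discovering an explicit reasoning structure for a task, then following that structure on each instance — can be outlined in a minimal Python sketch. This is an illustration, not the paper's actual prompts: `call_llm` is a hypothetical stand-in for any LLM completion API (stubbed here so the example runs offline), and the seed module list is abbreviated from the kind of atomic reasoning modules the abstract mentions.

```python
# Minimal sketch of the SELF-DISCOVER two-stage pipeline (illustrative only).

# A few atomic reasoning modules of the kind named in the abstract
# (critical thinking, step-by-step thinking); the real set is larger.
REASONING_MODULES = [
    "Critical thinking: analyze the problem from different perspectives.",
    "Let's think step by step.",
    "Break the problem into smaller sub-problems.",
]

def call_llm(prompt: str) -> str:
    """Stub for an LLM call; a real implementation would query GPT-4,
    PaLM 2, Llama 2, etc. Here it returns a fixed placeholder so the
    sketch is runnable without a model."""
    return '{"step_1": "...", "step_2": "...", "answer": "..."}'

def self_discover(task_examples: list[str]) -> str:
    """Stage 1 (run once per task): compose selected atomic modules
    into an explicit, task-specific reasoning structure."""
    modules = "\n".join(REASONING_MODULES)
    # SELECT: pick the modules relevant to this task.
    selected = call_llm(
        f"Select reasoning modules useful for these task examples:\n"
        f"{modules}\nExamples:\n" + "\n".join(task_examples)
    )
    # ADAPT: rephrase the selected modules to be task-specific.
    adapted = call_llm(f"Adapt these modules to the task:\n{selected}")
    # IMPLEMENT: write them out as an explicit step-by-step structure.
    return call_llm(f"Write a step-by-step reasoning plan (JSON) from:\n{adapted}")

def solve(structure: str, task_instance: str) -> str:
    """Stage 2 (run per instance): the model follows the discovered
    structure during decoding to produce its answer."""
    return call_llm(
        f"Follow this reasoning structure to solve the task:\n"
        f"{structure}\nTask: {task_instance}"
    )
```

Because the structure is discovered once per task rather than per instance, the per-instance cost is a single forward pass, which is consistent with the abstract's claim of far lower inference compute than self-consistency-style sampling.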