Self-Discover: Large Language Models Self-Compose Reasoning Structures
February 6, 2024
Authors: Pei Zhou, Jay Pujara, Xiang Ren, Xinyun Chen, Heng-Tze Cheng, Quoc V. Le, Ed H. Chi, Denny Zhou, Swaroop Mishra, Huaixiu Steven Zheng
cs.AI
Abstract
We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the
task-intrinsic reasoning structures to tackle complex reasoning problems that
are challenging for typical prompting methods. Core to the framework is a
self-discovery process where LLMs select multiple atomic reasoning modules such
as critical thinking and step-by-step thinking, and compose them into an
explicit reasoning structure for LLMs to follow during decoding. SELF-DISCOVER
substantially improves GPT-4 and PaLM 2's performance on challenging reasoning
benchmarks such as BigBench-Hard, grounded agent reasoning, and MATH, by as
much as 32% compared to Chain of Thought (CoT). Furthermore, SELF-DISCOVER
outperforms inference-intensive methods such as CoT-Self-Consistency by more
than 20%, while requiring 10-40x less inference compute. Finally, we show that
the self-discovered reasoning structures are universally applicable across
model families: from PaLM 2-L to GPT-4, and from GPT-4 to Llama2, and share
commonalities with human reasoning patterns.
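A minimal sketch of the two-stage process the abstract describes, written in Python under stated assumptions: `call_llm` is any caller-supplied text-completion function, and the module list, prompt wording, and function names (`self_discover`, `solve`) are hypothetical illustrations rather than the paper's exact prompts or API.

```python
from typing import Callable, List

# Hypothetical subset of atomic reasoning modules (illustrative only;
# the paper draws on a larger library of such module descriptions).
ATOMIC_MODULES: List[str] = [
    "Use critical thinking to analyze the problem from different angles.",
    "Think step by step, solving one sub-problem at a time.",
    "Break the problem down into smaller, independent parts.",
    "Reflect on possible mistakes and verify intermediate results.",
]

def self_discover(task: str, call_llm: Callable[[str], str]) -> str:
    """Stage 1: select relevant modules, then compose them into an
    explicit reasoning structure for the task."""
    select_prompt = (
        "Select the reasoning modules most useful for solving tasks like:\n"
        f"{task}\n\nModules:\n"
        + "\n".join(f"- {m}" for m in ATOMIC_MODULES)
    )
    selected = call_llm(select_prompt)

    compose_prompt = (
        "Compose the selected modules into an explicit, step-by-step "
        "reasoning structure (e.g., a JSON plan) to follow when solving "
        f"tasks like:\n{task}\n\nSelected modules:\n{selected}"
    )
    return call_llm(compose_prompt)

def solve(task: str, structure: str, call_llm: Callable[[str], str]) -> str:
    """Stage 2: follow the self-discovered structure while decoding
    the answer to a task instance."""
    return call_llm(
        "Follow this reasoning structure to solve the task.\n"
        f"Structure:\n{structure}\n\nTask:\n{task}"
    )
```

With any backend supplied as `call_llm`, `solve(task, self_discover(task, call_llm), call_llm)` applies the discovered structure to a task instance; because the structure is discovered once per task type rather than per instance, per-instance decoding stays close to a single pass, consistent with the abstract's 10-40x compute comparison against self-consistency.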