Self-Discover: Large Language Models Self-Compose Reasoning Structures
February 6, 2024
Authors: Pei Zhou, Jay Pujara, Xiang Ren, Xinyun Chen, Heng-Tze Cheng, Quoc V. Le, Ed H. Chi, Denny Zhou, Swaroop Mishra, Huaixiu Steven Zheng
cs.AI
Abstract
We introduce SELF-DISCOVER, a general framework for LLMs to self-discover the
task-intrinsic reasoning structures to tackle complex reasoning problems that
are challenging for typical prompting methods. Core to the framework is a
self-discovery process where LLMs select multiple atomic reasoning modules such
as critical thinking and step-by-step thinking, and compose them into an
explicit reasoning structure for LLMs to follow during decoding. SELF-DISCOVER
substantially improves GPT-4 and PaLM 2's performance on challenging reasoning
benchmarks such as BigBench-Hard, grounded agent reasoning, and MATH, by as
much as 32% compared to Chain of Thought (CoT). Furthermore, SELF-DISCOVER
outperforms inference-intensive methods such as CoT-Self-Consistency by more
than 20%, while requiring 10-40x less inference compute. Finally, we show that
the self-discovered reasoning structures are universally applicable across
model families: from PaLM 2-L to GPT-4, and from GPT-4 to Llama2, and share
commonalities with human reasoning patterns.
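A minimal sketch of the two-stage process the abstract describes, written in Python under stated assumptions: `call_llm` is any caller-supplied text-completion function, and the module list, prompt wording, and function names (`self_discover`, `solve`) are hypothetical illustrations rather than the paper's exact prompts or API.

```python
from typing import Callable, List

# Hypothetical subset of atomic reasoning modules (illustrative only;
# the paper draws on a larger library of such module descriptions).
ATOMIC_MODULES: List[str] = [
    "Use critical thinking to analyze the problem from different angles.",
    "Think step by step, solving one sub-problem at a time.",
    "Break the problem down into smaller, independent parts.",
    "Reflect on possible mistakes and verify intermediate results.",
]

def self_discover(task: str, call_llm: Callable[[str], str]) -> str:
    """Stage 1: select relevant modules, then compose them into an
    explicit reasoning structure for the task."""
    select_prompt = (
        "Select the reasoning modules most useful for solving tasks like:\n"
        f"{task}\n\nModules:\n"
        + "\n".join(f"- {m}" for m in ATOMIC_MODULES)
    )
    selected = call_llm(select_prompt)

    compose_prompt = (
        "Compose the selected modules into an explicit, step-by-step "
        "reasoning structure (e.g., a JSON plan) to follow when solving "
        f"tasks like:\n{task}\n\nSelected modules:\n{selected}"
    )
    return call_llm(compose_prompt)

def solve(task: str, structure: str, call_llm: Callable[[str], str]) -> str:
    """Stage 2: follow the self-discovered structure while decoding
    the answer to a task instance."""
    return call_llm(
        "Follow this reasoning structure to solve the task.\n"
        f"Structure:\n{structure}\n\nTask:\n{task}"
    )
```

With any backend supplied as `call_llm`, `solve(task, self_discover(task, call_llm), call_llm)` applies the discovered structure to a task instance; because the structure is discovered once per task type rather than per instance, per-instance decoding stays close to a single pass, consistent with the abstract's 10-40x compute comparison against self-consistency.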