Self-Discover: Large Language Models Self-Compose Reasoning Structures

Abstract

We introduce SELF-DISCOVER, a general framework in which LLMs self-discover task-intrinsic reasoning structures to tackle complex reasoning problems that are challenging for typical prompting methods. Core to the framework is a self-discovery process in which an LLM selects multiple atomic reasoning modules, such as critical thinking and step-by-step thinking, and composes them into an explicit reasoning structure to follow during decoding. SELF-DISCOVER substantially improves the performance of GPT-4 and PaLM 2 on challenging reasoning benchmarks such as BigBench-Hard, grounded agent reasoning, and MATH, by as much as 32% compared to Chain of Thought (CoT). Furthermore, SELF-DISCOVER outperforms inference-intensive methods such as CoT-Self-Consistency by more than 20% while requiring 10-40x less inference compute. Finally, we show that the self-discovered reasoning structures are universally applicable across model families (from PaLM 2-L to GPT-4, and from GPT-4 to Llama2) and share commonalities with human reasoning patterns.
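
The select-then-compose process described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the `llm` callable (prompt in, completion out), the prompt wording, the function names, and the choice to build the structure once per task and reuse it for every instance. The module list contains only the two examples the abstract names.

```python
from typing import Callable, List

# Assumed interface: any function that maps a prompt string to a completion.
LLM = Callable[[str], str]

# Illustrative seed set: only the two modules named in the abstract.
REASONING_MODULES: List[str] = [
    "critical thinking",
    "step-by-step thinking",
]


def select_modules(llm: LLM, task: str, modules: List[str]) -> List[str]:
    """Ask the model which modules help, and keep the ones it names."""
    prompt = (
        "Task description:\n" + task + "\n\n"
        "Candidate reasoning modules:\n"
        + "\n".join(f"- {m}" for m in modules)
        + "\n\nList the modules that would help solve this task."
    )
    reply = llm(prompt).lower()
    return [m for m in modules if m.lower() in reply]


def compose_structure(llm: LLM, task: str, selected: List[str]) -> str:
    """Ask the model to turn the selected modules into an explicit plan."""
    prompt = (
        "Compose the modules below into an explicit reasoning structure "
        "for this task.\n\nTask description:\n" + task + "\n\nModules:\n"
        + "\n".join(f"- {m}" for m in selected)
    )
    return llm(prompt)


def solve(llm: LLM, structure: str, instance: str) -> str:
    """Answer one task instance while following the composed structure."""
    prompt = (
        "Follow the reasoning structure below to solve the problem.\n\n"
        "Structure:\n" + structure + "\n\nProblem:\n" + instance
    )
    return llm(prompt)


def self_discover(llm: LLM, task: str, instances: List[str]) -> List[str]:
    """Build the structure once for the task, then reuse it per instance."""
    selected = select_modules(llm, task, REASONING_MODULES)
    structure = compose_structure(llm, task, selected)
    return [solve(llm, structure, x) for x in instances]
```

Under the stated assumption that the structure is built once per task, this sketch spends two model calls on discovery and one call per instance afterwards, whereas sampling-based methods such as CoT-Self-Consistency issue many calls per instance.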
