Large Language Models Cannot Self-Correct Reasoning Yet

Abstract

Large Language Models (LLMs) have emerged as a groundbreaking technology with their unparalleled text generation capabilities across various applications. Nevertheless, concerns persist regarding the accuracy and appropriateness of their generated content. A contemporary methodology, self-correction, has been proposed as a remedy to these issues. Building upon this premise, this paper critically examines the role and efficacy of self-correction within LLMs, shedding light on its true potential and limitations. Central to our investigation is the notion of intrinsic self-correction, whereby an LLM attempts to correct its initial responses based solely on its inherent capabilities, without the crutch of external feedback. In the context of reasoning, our research indicates that LLMs struggle to self-correct their responses without external feedback, and at times, their performance might even degrade after self-correction. Drawing from these insights, we offer suggestions for future research and practical applications in this field.
