Introspective Tips: Large Language Model for In-Context Decision Making

Abstract

The emergence of large language models (LLMs) has substantially influenced natural language processing, demonstrating exceptional results across various tasks. In this study, we employ "Introspective Tips" to facilitate LLMs in self-optimizing their decision-making. By introspectively examining trajectories, the LLM refines its policy by generating succinct and valuable tips. Our method enhances the agent's performance in both few-shot and zero-shot learning situations by considering three essential scenarios: learning from the agent's past experiences, integrating expert demonstrations, and generalizing across diverse games. Importantly, we accomplish these improvements without fine-tuning the LLM parameters; rather, we adjust the prompt to generalize insights from the three aforementioned situations. Our framework not only supports but also emphasizes the advantage of employing LLMs in in-context decision-making. Experiments involving over 100 games in TextWorld illustrate the superior performance of our approach.
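
The idea of improving an agent by adjusting the prompt rather than the model parameters can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `build_prompt`, `introspect`, `run_trials`, the `llm` and `play` callables, and the prompt wording are all hypothetical, chosen only to show how tips distilled from past trajectories might be fed back into later prompts.

```python
from typing import Callable, List

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]
# Hypothetical type: runs one game episode for a prompt and returns a text trajectory.
Rollout = Callable[[str], str]


def build_prompt(task: str, tips: List[str]) -> str:
    """Prepend accumulated tips to the task text; no model weights are changed."""
    if not tips:
        return task
    tip_block = "\n".join(f"- {t}" for t in tips)
    return f"Tips from earlier trajectories:\n{tip_block}\n\n{task}"


def introspect(llm: LLM, trajectory: str) -> str:
    """Ask the model to condense one trajectory into a short tip (assumed wording)."""
    question = f"Review this trajectory and give one concise tip:\n{trajectory}"
    return llm(question).strip()


def run_trials(llm: LLM, task: str, play: Rollout, n_trials: int) -> List[str]:
    """Play several episodes, adding one tip to the prompt after each episode."""
    tips: List[str] = []
    for _ in range(n_trials):
        prompt = build_prompt(task, tips)   # the prompt grows, the model stays fixed
        trajectory = play(prompt)           # episode under the current prompt
        tips.append(introspect(llm, trajectory))
    return tips
```

In this sketch the trajectories passed to `introspect` could equally come from the agent's own attempts, from expert demonstrations, or from other games, which loosely mirrors the three scenarios listed in the abstract.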
