SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents

Abstract

Humans learn social skills through both imitation and social interaction. This social learning process is largely understudied in existing research on building language agents. Motivated by this gap, we propose SOTOPIA-π, an interactive learning method that improves the social intelligence of language agents. The method applies behavior cloning and self-reinforcement training to social interaction data filtered according to large language model (LLM) ratings. We show that our training method allows a 7B LLM to reach the social goal completion ability of an expert model (a GPT-4-based agent), while improving the safety of language agents and maintaining general QA ability on the MMLU benchmark. We also find that this training paradigm uncovers some difficulties in LLM-based evaluation of social intelligence: LLM-based evaluators overestimate the abilities of language agents trained specifically for social interaction.
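The rating-based filtering step can be illustrated with a minimal sketch. It is not the authors' implementation: the `Episode` structure, its field names, the 0-10 rating scale, and the threshold of 7.0 are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows the general idea of keeping highly rated interactions and routing expert episodes to behavior cloning and the agent's own episodes to self-reinforcement.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Episode:
    """One recorded social interaction (hypothetical structure)."""
    source: str        # "expert" (e.g. a GPT-4-based agent) or "self" (the model being trained)
    turns: List[str]   # utterances produced by the agent in this episode
    llm_rating: float  # score assigned by an LLM evaluator (assumed 0-10 scale)


def filter_by_rating(episodes: List[Episode], threshold: float) -> List[Episode]:
    """Keep only episodes whose LLM rating is at least `threshold`."""
    return [ep for ep in episodes if ep.llm_rating >= threshold]


def build_training_sets(
    episodes: List[Episode], threshold: float = 7.0  # threshold is an assumed value
) -> Tuple[List[Episode], List[Episode]]:
    """Split the filtered episodes into behavior-cloning and self-reinforcement sets."""
    kept = filter_by_rating(episodes, threshold)
    behavior_cloning = [ep for ep in kept if ep.source == "expert"]
    self_reinforcement = [ep for ep in kept if ep.source == "self"]
    return behavior_cloning, self_reinforcement


# Toy data, invented for this example.
episodes = [
    Episode("expert", ["Hi, could we split the bill?"], 9.0),
    Episode("expert", ["Whatever."], 3.0),
    Episode("self", ["Sure, let's find a fair split."], 8.0),
    Episode("self", ["No."], 2.0),
]
bc, sr = build_training_sets(episodes)
print(len(bc), len(sr))
```

In this sketch both training sets would then feed an ordinary supervised fine-tuning step; the difference between them is only where the filtered episodes come from.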
