SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents

Abstract

Humans learn social skills through both imitation and social interaction. Existing research on building language agents has left this social learning process largely understudied. Motivated by this gap, we propose SOTOPIA-π, an interactive learning method that improves the social intelligence of language agents. The method applies behavior cloning and self-reinforcement training to social interaction data filtered according to large language model (LLM) ratings. We show that our training method allows a 7B LLM to reach the social goal completion ability of an expert model (a GPT-4-based agent), while improving the safety of language agents and maintaining general QA ability on the MMLU benchmark. We also find that this training paradigm uncovers some difficulties in LLM-based evaluation of social intelligence: LLM-based evaluators overestimate the abilities of language agents trained specifically for social interaction.
