SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents

Abstract

Humans learn social skills through both imitation and social interaction. This social learning process is largely understudied in existing research on building language agents. Motivated by this gap, we propose SOTOPIA-π, an interactive learning method that improves the social intelligence of language agents. The method applies behavior cloning and self-reinforcement training to social interaction data filtered according to large language model (LLM) ratings. We show that our training method allows a 7B LLM to reach the social goal completion ability of an expert model (a GPT-4-based agent), while improving the safety of language agents and maintaining general QA ability on the MMLU benchmark. We also find that this training paradigm uncovers some difficulties in LLM-based evaluation of social intelligence: LLM-based evaluators overestimate the abilities of language agents trained specifically for social interaction.
