SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents

Abstract

Humans learn social skills through both imitation and social interaction. This social learning process has been largely understudied in existing research on building language agents. Motivated by this gap, we propose SOTOPIA-π, an interactive learning method that improves the social intelligence of language agents. The method applies behavior cloning and self-reinforcement training to social interaction data filtered according to large language model (LLM) ratings. We show that our training method allows a 7B LLM to reach the social goal completion ability of an expert model (a GPT-4-based agent), while improving the safety of language agents and maintaining general QA ability on the MMLU benchmark. We also find that this training paradigm uncovers some difficulties in LLM-based evaluation of social intelligence: LLM-based evaluators overestimate the abilities of language agents trained specifically for social interaction.
