
Generative Expressive Robot Behaviors using Large Language Models

January 26, 2024
作者: Karthik Mahadevan, Jonathan Chien, Noah Brown, Zhuo Xu, Carolina Parada, Fei Xia, Andy Zeng, Leila Takayama, Dorsa Sadigh
cs.AI

Abstract

People employ expressive behaviors to effectively communicate and coordinate their actions with others, such as nodding to acknowledge a person glancing at them or saying "excuse me" to pass people in a busy corridor. We would like robots to also demonstrate expressive behaviors in human-robot interaction. Prior work proposes rule-based methods that struggle to scale to new communication modalities or social situations, while data-driven methods require specialized datasets for each social situation the robot is used in. We propose to leverage the rich social context available from large language models (LLMs) and their ability to generate motion based on instructions or user preferences, to generate expressive robot motion that is adaptable and composable, building upon each other. Our approach utilizes few-shot chain-of-thought prompting to translate human language instructions into parametrized control code using the robot's available and learned skills. Through user studies and simulation experiments, we demonstrate that our approach produces behaviors that users found to be competent and easy to understand. Supplementary material can be found at https://generative-expressive-motion.github.io/.
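To make the abstract's pipeline concrete, here is a minimal sketch of few-shot chain-of-thought prompting that maps a human language instruction to parametrized control code over a library of robot skills. The prompt format, skill names (`nod`, `say`, `move`), and parameters are illustrative assumptions, not the paper's actual API; a real system would send the prompt to an LLM and execute its generated "Code:" line against the skill library.

```python
# Hypothetical sketch of the few-shot chain-of-thought setup described
# in the abstract. Skill names and prompt layout are assumptions made
# for illustration, not the authors' implementation.

FEW_SHOT_EXAMPLES = """\
Instruction: Acknowledge a person glancing at you.
Reasoning: A glance is a lightweight social bid, so respond with an
equally lightweight acknowledgment: a brief nod.
Code: robot.nod(times=1, speed=0.5)

Instruction: Pass people in a busy corridor.
Reasoning: Signal intent verbally before moving, then proceed slowly.
Code: robot.say("excuse me"); robot.move(direction="forward", speed=0.2)
"""

def build_prompt(instruction: str) -> str:
    """Assemble a few-shot chain-of-thought prompt for the LLM."""
    return (
        "Translate the instruction into parametrized robot control code "
        "using the skills shown below.\n\n"
        + FEW_SHOT_EXAMPLES
        + f"\nInstruction: {instruction}\nReasoning:"
    )

class Robot:
    """Mock skill library standing in for the robot's available skills."""
    def __init__(self):
        self.log = []  # record of executed skill calls
    def nod(self, times=1, speed=0.5):
        self.log.append(f"nod(times={times}, speed={speed})")
    def say(self, text):
        self.log.append(f"say({text!r})")
    def move(self, direction, speed):
        self.log.append(f"move({direction}, {speed})")

# In the full pipeline, the LLM's generated "Code:" line would be run
# against the skill library; here we execute one hand-written example.
robot = Robot()
robot.say("excuse me")
robot.move(direction="forward", speed=0.2)
```

Because each behavior is expressed as composable calls into existing skills, new instructions or user corrections ("nod more slowly") can be handled by re-prompting rather than by collecting a new dataset.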