Generative Expressive Robot Behaviors using Large Language Models
January 26, 2024
Authors: Karthik Mahadevan, Jonathan Chien, Noah Brown, Zhuo Xu, Carolina Parada, Fei Xia, Andy Zeng, Leila Takayama, Dorsa Sadigh
cs.AI
Abstract
People employ expressive behaviors to effectively communicate and coordinate
their actions with others, such as nodding to acknowledge a person glancing at
them or saying "excuse me" to pass people in a busy corridor. We would like
robots to also demonstrate expressive behaviors in human-robot interaction.
Prior work proposes rule-based methods that struggle to scale to new
communication modalities or social situations, while data-driven methods
require specialized datasets for each social situation the robot is used in. We
propose to leverage the rich social context available from large language
models (LLMs) and their ability to generate motion based on instructions or
user preferences, to generate expressive robot motions that are adaptable and
composable, building on one another. Our approach utilizes few-shot
chain-of-thought prompting to translate human language instructions into
parametrized control code using the robot's available and learned skills.
Through user studies and simulation experiments, we demonstrate that our
approach produces behaviors that users found to be competent and easy to
understand. Supplementary material can be found at
https://generative-expressive-motion.github.io/.
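
To make the abstract's pipeline concrete, below is a minimal sketch of how few-shot chain-of-thought prompting can translate a human language instruction into parametrized control code over a robot's skills. Everything here is illustrative rather than drawn from the paper: the skill names (look_at, head_nod, say, move_base), the prompt wording, and the call_llm stub are hypothetical placeholders, not the authors' actual prompts or API.

```python
# Minimal sketch of few-shot chain-of-thought prompting for expressive
# robot behavior generation. All skill names and the call_llm stub are
# hypothetical; the paper's actual prompts and skill set may differ.

FEW_SHOT_EXAMPLES = """\
Instruction: acknowledge a person glancing at the robot
Reasoning: A brief nod communicates acknowledgment; the robot should first
orient its head toward the person, then nod once.
Code:
look_at(target="person")
head_nod(times=1, speed=0.5)

Instruction: pass a person in a busy corridor
Reasoning: The robot should signal intent verbally, then shift sideways to
leave room while continuing forward.
Code:
say(text="excuse me")
move_base(dx=0.0, dy=0.3)
move_base(dx=1.0, dy=0.0)
"""

PROMPT_TEMPLATE = """\
You control a robot with these parametrized skills:
look_at(target), head_nod(times, speed), say(text), move_base(dx, dy).
For each instruction, reason step by step about what expressive behavior
is socially appropriate, then emit code using only these skills.

{examples}
Instruction: {instruction}
Reasoning:"""


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM query; wire in any chat-completion client."""
    raise NotImplementedError("connect an LLM provider here")


def generate_expressive_behavior(instruction: str) -> str:
    """Return parametrized control code for a natural-language instruction."""
    prompt = PROMPT_TEMPLATE.format(examples=FEW_SHOT_EXAMPLES,
                                    instruction=instruction)
    # The model continues with its reasoning followed by a "Code:" block;
    # everything after "Code:" is the generated control program.
    completion = call_llm(prompt)
    return completion.split("Code:", 1)[-1].strip()
```

In this pattern, each few-shot example demonstrates both the social reasoning step and its mapping to skill parameters, so a new instruction can compose the robot's existing skills without a situation-specific dataset, matching the adaptability and composability the abstract claims.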