Foundation Models in Robotics: Applications, Challenges, and the Future
December 13, 2023
Authors: Roya Firoozi, Johnathan Tucker, Stephen Tian, Anirudha Majumdar, Jiankai Sun, Weiyu Liu, Yuke Zhu, Shuran Song, Ashish Kapoor, Karol Hausman, Brian Ichter, Danny Driess, Jiajun Wu, Cewu Lu, Mac Schwager
cs.AI
Abstract
We survey applications of pretrained foundation models in robotics.
Traditional deep learning models in robotics are trained on small datasets
tailored for specific tasks, which limits their adaptability across diverse
applications. In contrast, foundation models pretrained on internet-scale data
appear to have superior generalization capabilities, and in some instances
display an emergent ability to find zero-shot solutions to problems that are
not present in the training data. Foundation models may hold the potential to
enhance various components of the robot autonomy stack, from perception to
decision-making and control. For example, large language models can generate
code or provide common sense reasoning, while vision-language models enable
open-vocabulary visual recognition. However, significant open research
challenges remain, particularly around the scarcity of robot-relevant training
data, safety guarantees and uncertainty quantification, and real-time
execution. In this survey, we study recent papers that have used or built
foundation models to solve robotics problems. We explore how foundation models
contribute to improving robot capabilities in the domains of perception,
decision-making, and control. We discuss the challenges hindering the adoption
of foundation models in robot autonomy and provide opportunities and potential
pathways for future advancements. The GitHub project corresponding to this
paper can be found at
https://github.com/robotics-survey/Awesome-Robotics-Foundation-Models
(preliminary release; we are committed to further enhancing and updating this
work to ensure its quality and relevance).