
Foundation Models in Robotics: Applications, Challenges, and the Future

December 13, 2023
Authors: Roya Firoozi, Johnathan Tucker, Stephen Tian, Anirudha Majumdar, Jiankai Sun, Weiyu Liu, Yuke Zhu, Shuran Song, Ashish Kapoor, Karol Hausman, Brian Ichter, Danny Driess, Jiajun Wu, Cewu Lu, Mac Schwager
cs.AI

Abstract

We survey applications of pretrained foundation models in robotics. Traditional deep learning models in robotics are trained on small datasets tailored for specific tasks, which limits their adaptability across diverse applications. In contrast, foundation models pretrained on internet-scale data appear to have superior generalization capabilities, and in some instances display an emergent ability to find zero-shot solutions to problems that are not present in the training data. Foundation models may hold the potential to enhance various components of the robot autonomy stack, from perception to decision-making and control. For example, large language models can generate code or provide common sense reasoning, while vision-language models enable open-vocabulary visual recognition. However, significant open research challenges remain, particularly around the scarcity of robot-relevant training data, safety guarantees and uncertainty quantification, and real-time execution. In this survey, we study recent papers that have used or built foundation models to solve robotics problems. We explore how foundation models contribute to improving robot capabilities in the domains of perception, decision-making, and control. We discuss the challenges hindering the adoption of foundation models in robot autonomy and provide opportunities and potential pathways for future advancements. The GitHub project corresponding to this paper (Preliminary release. We are committed to further enhancing and updating this work to ensure its quality and relevance) can be found here: https://github.com/robotics-survey/Awesome-Robotics-Foundation-Models