
RoboChallenge: Large-scale Real-robot Evaluation of Embodied Policies

October 20, 2025
Authors: Adina Yakefu, Bin Xie, Chongyang Xu, Enwen Zhang, Erjin Zhou, Fan Jia, Haitao Yang, Haoqiang Fan, Haowei Zhang, Hongyang Peng, Jing Tan, Junwen Huang, Kai Liu, Kaixin Liu, Kefan Gu, Qinglun Zhang, Ruitao Zhang, Saike Huang, Shen Cheng, Shuaicheng Liu, Tiancai Wang, Tiezhen Wang, Wei Sun, Wenbin Tang, Yajun Wei, Yang Chen, Youqiang Gui, Yucheng Zhao, Yunchao Ma, Yunfei Wei, Yunhuan Yang, Yutong Guo, Ze Chen, Zhengyuan Du, Ziheng Zhang, Ziming Liu, Ziwei Yan
cs.AI

Abstract

Testing on real machines is indispensable for robotic control algorithms. For learning-based algorithms, especially vision-language-action (VLA) models, the demand for large-scale evaluation, i.e. testing a large number of models on a large number of tasks, is becoming increasingly urgent. However, doing this well is highly non-trivial, especially when scalability and reproducibility are taken into account. In this report, we describe our methodology for constructing RoboChallenge, an online evaluation system for testing robotic control algorithms, and our survey of recent state-of-the-art VLA models using our initial benchmark, Table30.