GR-Dexter Technical Report
December 30, 2025
Authors: Ruoshi Wen, Guangzeng Chen, Zhongren Cui, Min Du, Yang Gou, Zhigang Han, Liqun Huang, Mingyu Lei, Yunfei Li, Zhuohang Li, Wenlei Liu, Yuxiao Liu, Xiao Ma, Hao Niu, Yutao Ouyang, Zeyu Ren, Haixin Shi, Wei Xu, Haoxiang Zhang, Jiajun Zhang, Xiao Zhang, Liwei Zheng, Weiheng Zhong, Yifei Zhou, Zhengming Zhu, Hang Li
cs.AI
Abstract
Vision-language-action (VLA) models have enabled language-conditioned, long-horizon robot manipulation, but most existing systems are limited to parallel-jaw grippers. Scaling VLA policies to bimanual robots with high degree-of-freedom (DoF) dexterous hands remains challenging due to the expanded action space, frequent hand-object occlusions, and the cost of collecting real-robot data. We present GR-Dexter, a holistic hardware-model-data framework for VLA-based generalist manipulation on a bimanual dexterous-hand robot. Our approach combines the design of a compact 21-DoF robotic hand, an intuitive bimanual teleoperation system for real-robot data collection, and a training recipe that leverages teleoperated robot trajectories together with large-scale vision-language data and carefully curated cross-embodiment datasets. Across real-world evaluations spanning long-horizon everyday manipulation and generalizable pick-and-place, GR-Dexter achieves strong in-domain performance and improved robustness to unseen objects and unseen instructions. We hope GR-Dexter serves as a practical step toward generalist dexterous-hand robotic manipulation.