OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation
May 6, 2025
Authors: Can Cui, Pengxiang Ding, Wenxuan Song, Shuanghao Bai, Xinyang Tong, Zirui Ge, Runze Suo, Wanqi Zhou, Yang Liu, Bofang Jia, Han Zhao, Siteng Huang, Donglin Wang
cs.AI
Abstract
Dual-system Vision-Language-Action (VLA) architectures have become a hot
topic in embodied intelligence research, but existing open-source work is
not yet sufficient to support further performance analysis and optimization.
To address this problem, this paper summarizes and compares the structural
designs of existing dual-system architectures and conducts a systematic
empirical evaluation of their core design elements. Finally, it provides a
low-cost open-source model for further exploration. The project will continue
to be updated with more experimental conclusions and with better-performing
open-source models. Project page: https://openhelix-robot.github.io/.