
Hy-Embodied-0.5-VLA: From Vision-Language-Action Models to a Real-World Robot Learning Stack

June 12, 2026
Authors: He Zhang, Lingzhu Xiang, Haitao Lin, Zeyu Huang, Minghui Wang, Dingyan Zhong, Yubo Dong, Yihao Wu, Yongming Rao, Dongsheng Zhang, Wanjia He, Ling Chen, Kai Huang, Jiahao Chen, Sichang Su, Xumin Yu, Ziyi Wang, Chengwei Zhu, Xiao Teng, Yuchun Guo, Yufeng Zhang, Yuandong Liu, Rui Wang, Zisheng Lu, Han Hu, Zhengyou Zhang
cs.AI

Abstract

In this report, we present Hy-Embodied-0.5-VLA (HyVLA-0.5 for short), an end-to-end system that spans the full robot learning stack: data collection, model design, continued pre-training and supervised fine-tuning, reinforcement learning (RL) post-training, and real-world deployment. Each component serves a distinct role in this stack.