Hy-Embodied-0.5-VLA: From Vision-Language-Action Models to a Real-World Robot Learning Stack

Abstract

This report presents Hy-Embodied-0.5-VLA (HyVLA-0.5), an end-to-end system that spans the full robot learning stack: data collection, model design, continued pre-training and supervised fine-tuning, reinforcement learning (RL) post-training, and real-world deployment. Each component serves a distinct role in this stack.