ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks
August 11, 2025
Authors: Kaijun Wang, Liqin Lu, Mingyu Liu, Jianuo Jiang, Zeju Li, Bolin Zhang, Wancai Zheng, Xinyi Yu, Hao Chen, Chunhua Shen
cs.AI
Abstract
Language-guided long-horizon mobile manipulation has long been a grand
challenge in embodied semantic reasoning, generalizable manipulation, and
adaptive locomotion. Three fundamental limitations hinder progress: First,
although large language models have improved spatial reasoning and task
planning through semantic priors, existing implementations remain confined to
tabletop scenarios, failing to address the constrained perception and limited
actuation ranges of mobile platforms. Second, current manipulation strategies
exhibit insufficient generalization when confronted with the diverse object
configurations encountered in open-world environments. Third, while crucial for
practical deployment, the dual requirement of maintaining high platform
maneuverability alongside precise end-effector control in unstructured settings
remains understudied.
In this work, we present ODYSSEY, a unified mobile manipulation framework for
agile quadruped robots equipped with manipulators, which seamlessly integrates
high-level task planning with low-level whole-body control. To address the
challenge of egocentric perception in language-conditioned tasks, we introduce
a hierarchical planner powered by a vision-language model, enabling
long-horizon instruction decomposition and precise action execution. At the
control level, our novel whole-body policy achieves robust coordination across
challenging terrains. We further present the first benchmark for long-horizon
mobile manipulation, evaluating diverse indoor and outdoor scenarios. Through
successful sim-to-real transfer, we demonstrate the system's generalization and
robustness in real-world deployments, underscoring the practicality of legged
manipulators in unstructured environments. Our work advances the feasibility of
generalized robotic assistants capable of complex, dynamic tasks. Our project
page: https://kaijwang.github.io/odyssey.github.io/