Contact-Anchored Policies: Contact Conditioning Creates Strong Robot Utility Models
February 9, 2026
Authors: Zichen Jeff Cui, Omar Rayyan, Haritheja Etukuru, Bowen Tan, Zavier Andrianarivo, Zicheng Teng, Yihang Zhou, Krish Mehta, Nicholas Wojno, Kevin Yuanbo Wu, Manan H Anjaria, Ziyuan Wu, Manrong Mao, Guangxun Zhang, Binit Shah, Yejin Kim, Soumith Chintala, Lerrel Pinto, Nur Muhammad Mahi Shafiullah
cs.AI
Abstract
The prevalent paradigm in robot learning attempts to generalize across environments, embodiments, and tasks with language prompts at runtime. A fundamental tension limits this approach: language is often too abstract to guide the concrete physical understanding required for robust manipulation. In this work, we introduce Contact-Anchored Policies (CAP), which replace language conditioning with points of physical contact in space. Simultaneously, we structure CAP as a library of modular utility models rather than a monolithic generalist policy. This factorization allows us to implement a real-to-sim iteration cycle: we build EgoGym, a lightweight simulation benchmark, to rapidly identify failure modes and refine our models and datasets prior to real-world deployment. We show that by conditioning on contact and iterating via simulation, CAP generalizes to novel environments and embodiments out of the box on three fundamental manipulation skills while using only 23 hours of demonstration data, and outperforms large, state-of-the-art VLAs in zero-shot evaluations by 56%. All model checkpoints, codebase, hardware, simulation, and datasets will be open-sourced. Project page: https://cap-policy.github.io/