SRT-H: A Hierarchical Framework for Autonomous Surgery via Language Conditioned Imitation Learning

Abstract

Research on autonomous surgery has largely focused on automating simple tasks in controlled environments. However, real-world surgical applications demand dexterous manipulation over extended durations and generalization to the inherent variability of human tissue. These challenges remain difficult to address using existing logic-based or conventional end-to-end learning approaches. To address this gap, we propose a hierarchical framework for performing dexterous, long-horizon surgical steps. Our approach uses a high-level policy for task planning and a low-level policy for generating robot trajectories. The high-level planner plans in language space, generating task-level or corrective instructions that guide the robot through the long-horizon steps and correct for the low-level policy's errors. We validate our framework through ex vivo experiments on cholecystectomy, a commonly practiced minimally invasive procedure, and conduct ablation studies to evaluate key components of the system. Our method achieves a 100% success rate across eight unseen ex vivo gallbladders, operating fully autonomously without human intervention. This work demonstrates step-level autonomy in a surgical procedure, marking a milestone toward clinical deployment of autonomous surgical systems.
