SRT-H: A Hierarchical Framework for Autonomous Surgery via Language Conditioned Imitation Learning
May 15, 2025
Authors: Ji Woong Kim, Juo-Tung Chen, Pascal Hansen, Lucy X. Shi, Antony Goldenberg, Samuel Schmidgall, Paul Maria Scheikl, Anton Deguet, Brandon M. White, De Ru Tsai, Richard Cha, Jeffrey Jopling, Chelsea Finn, Axel Krieger
cs.AI
Abstract
Research on autonomous surgery has largely focused on simple task automation
in controlled environments. However, real-world surgical applications demand
dexterous manipulation over extended durations and generalization to the
inherent variability of human tissue. These challenges remain difficult to
address using existing logic-based or conventional end-to-end learning
approaches. To bridge this gap, we propose a hierarchical framework for
performing dexterous, long-horizon surgical steps. Our approach utilizes a
high-level policy for task planning and a low-level policy for generating robot
trajectories. The high-level planner plans in language space, generating
task-level or corrective instructions that guide the robot through the
long-horizon steps and correct for the low-level policy's errors. We validate
our framework through ex vivo experiments on cholecystectomy, a
commonly practiced minimally invasive procedure, and conduct ablation studies
to evaluate key components of the system. Our method achieves a 100% success
rate across eight unseen ex vivo gallbladders, operating fully autonomously
without human intervention. This work demonstrates step-level autonomy in a
surgical procedure, marking a milestone toward clinical deployment of
autonomous surgical systems.
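
The two-level control loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the learned policies are replaced by hand-written rules, and every name here (`ToyEnv`, `Instruction`, `high_level_policy`, `low_level_policy`, `run_episode`) is hypothetical. It only shows the general pattern in which a high-level planner issues task-level or corrective instructions as text and a low-level policy turns each instruction into actions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instruction:
    """A language instruction issued by the high-level planner."""
    text: str
    corrective: bool = False


@dataclass
class ToyEnv:
    """Stand-in for the robot and scene; purely illustrative."""
    total_steps: int = 3
    progress: int = 0
    drifted: bool = False

    def observe(self) -> dict:
        return {"progress": self.progress, "drifted": self.drifted}

    def apply(self, action: str) -> None:
        if action == "recenter":
            self.drifted = False
        elif action == "advance" and not self.drifted:
            self.progress += 1
            # Inject one deterministic low-level error after the first step.
            if self.progress == 1:
                self.drifted = True

    def done(self) -> bool:
        return self.progress >= self.total_steps and not self.drifted


def high_level_policy(obs: dict) -> Instruction:
    """Assumed rule-based planner: correct first, otherwise continue the task."""
    if obs["drifted"]:
        return Instruction("move back to the target", corrective=True)
    return Instruction(f"perform step {obs['progress'] + 1}")


def low_level_policy(obs: dict, instruction: Instruction) -> List[str]:
    """Assumed stand-in for a trajectory generator conditioned on the instruction."""
    return ["recenter"] if instruction.corrective else ["advance"]


def run_episode(env: ToyEnv, max_iterations: int = 20) -> List[Instruction]:
    """Alternate planning and execution until the toy task is finished."""
    issued: List[Instruction] = []
    for _ in range(max_iterations):
        if env.done():
            break
        obs = env.observe()
        instruction = high_level_policy(obs)
        issued.append(instruction)
        for action in low_level_policy(obs, instruction):
            env.apply(action)
    return issued


if __name__ == "__main__":
    for instruction in run_episode(ToyEnv()):
        kind = "corrective" if instruction.corrective else "task"
        print(f"[{kind}] {instruction.text}")
```

In this toy run the planner issues one corrective instruction after the injected error and then resumes the task-level steps. In the framework the abstract describes, both levels are learned policies rather than fixed rules.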