Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion
January 31, 2024
Authors: Tairan He, Chong Zhang, Wenli Xiao, Guanqi He, Changliu Liu, Guanya Shi
cs.AI
Abstract
Legged robots navigating cluttered environments must be jointly agile for
efficient task execution and safe to avoid collisions with obstacles or humans.
Existing studies either develop conservative controllers (< 1.0 m/s) to ensure
safety, or focus on agility without considering potentially fatal collisions.
This paper introduces Agile But Safe (ABS), a learning-based control framework
that enables agile and collision-free locomotion for quadrupedal robots. ABS
involves an agile policy to execute agile motor skills amidst obstacles and a
recovery policy to prevent failures, collaboratively achieving high-speed and
collision-free navigation. The policy switch in ABS is governed by a learned
control-theoretic reach-avoid value network, which also guides the recovery
policy as an objective function, thereby safeguarding the robot in a closed
loop. The training process involves the learning of the agile policy, the
reach-avoid value network, the recovery policy, and an exteroception
representation network, all in simulation. These trained modules can be
directly deployed in the real world with onboard sensing and computation,
leading to high-speed and collision-free navigation in confined indoor and
outdoor spaces with both static and dynamic obstacles.
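
The switching logic described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy value function, the threshold of 0.0, and the convention that a larger reach-avoid value means higher risk are all assumptions made for illustration. The sketch only shows the structure of the idea, in which one learned value both decides which policy runs and serves as the objective the recovery policy tries to reduce.

```python
from typing import Callable, List, Sequence, Tuple

Obs = Sequence[float]      # hypothetical observation: [distance_to_obstacle, speed]
Command = Sequence[float]  # hypothetical command: [forward_velocity, yaw_rate]


def select_action(
    obs: Obs,
    agile_policy: Callable[[Obs], Command],
    recovery_policy: Callable[[Obs], Command],
    ra_value: Callable[[Obs], float],
    threshold: float = 0.0,  # assumed switching threshold
) -> Tuple[Command, str]:
    """Run the agile policy unless the reach-avoid value signals risk."""
    # Assumed convention: value >= threshold means the state is unsafe.
    if ra_value(obs) >= threshold:
        return recovery_policy(obs), "recovery"
    return agile_policy(obs), "agile"


def safest_command(
    candidates: List[Command],
    predicted_value: Callable[[Command], float],
) -> Command:
    """Pick the candidate command with the lowest predicted reach-avoid value.

    This mimics using the value network as an objective for recovery;
    a real system would likely optimize over commands rather than enumerate.
    """
    return min(candidates, key=predicted_value)


# Toy stand-ins for the learned modules (purely illustrative).
def toy_ra_value(obs: Obs) -> float:
    distance, speed = obs
    return 0.5 * speed - distance  # fast and close => positive (risky)


def toy_agile_policy(obs: Obs) -> Command:
    return [3.0, 0.0]  # keep running forward


def toy_recovery_policy(obs: Obs) -> Command:
    candidates = [[0.0, 0.0], [0.5, 1.0], [0.5, -1.0]]
    # Toy objective: slower commands are predicted to be safer.
    return safest_command(candidates, lambda c: abs(c[0]))


far = select_action([2.0, 1.0], toy_agile_policy, toy_recovery_policy, toy_ra_value)
near = select_action([0.2, 3.0], toy_agile_policy, toy_recovery_policy, toy_ra_value)
print(far[1], near[1])
```

In this toy setup the robot stays in the agile mode while the obstacle is far away and hands control to the recovery policy once the assumed value crosses the threshold, which is the closed-loop safeguarding pattern the abstract describes.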