Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion

Abstract

Legged robots navigating cluttered environments must be both agile, for efficient task execution, and safe, to avoid collisions with obstacles or humans. Existing studies either develop conservative controllers (< 1.0 m/s) to ensure safety or focus on agility without considering potentially fatal collisions. This paper introduces Agile But Safe (ABS), a learning-based control framework that enables agile and collision-free locomotion for quadrupedal robots. ABS combines an agile policy, which executes agile motor skills amidst obstacles, with a recovery policy, which prevents failures; together they achieve high-speed, collision-free navigation. The policy switch in ABS is governed by a learned control-theoretic reach-avoid value network, which also serves as the objective function guiding the recovery policy, thereby safeguarding the robot in a closed loop. The training process learns the agile policy, the reach-avoid value network, the recovery policy, and an exteroception representation network, all in simulation. The trained modules can be deployed directly in the real world with onboard sensing and computation, yielding high-speed, collision-free navigation in confined indoor and outdoor spaces with both static and dynamic obstacles.
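
The policy switch described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `select_action`, the `threshold` parameter, and the sign convention (a value below the threshold means the agile policy is judged safe to continue) are all assumptions made for illustration, since the abstract states only that a learned reach-avoid value network governs the switch.

```python
from typing import Callable, List, Tuple

Observation = List[float]
Action = List[float]
Policy = Callable[[Observation], Action]


def select_action(
    obs: Observation,
    agile_policy: Policy,
    recovery_policy: Policy,
    ra_value: Callable[[Observation], float],
    threshold: float = 0.0,  # hypothetical switching threshold
) -> Tuple[Action, str]:
    """Choose between the agile and recovery policies.

    Assumed convention (not stated in the abstract): a reach-avoid
    value below `threshold` means the agile policy is expected to
    reach the goal without collision, so it stays in control.
    Otherwise the recovery policy takes over.
    """
    if ra_value(obs) < threshold:
        return agile_policy(obs), "agile"
    return recovery_policy(obs), "recovery"
```

In this sketch the switch is re-evaluated at every control step, so control can return to the agile policy once the value estimate drops below the threshold again. Stub callables standing in for the trained networks are enough to exercise it.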
