Learning Vision-based Pursuit-Evasion Robot Policies

Abstract

Learning strategic robot behavior, such as that required in pursuit-evasion interactions, under real-world constraints is extremely challenging. It requires exploiting the dynamics of the interaction and planning through both physical state and latent intent uncertainty. In this paper, we transform this intractable problem into a supervised learning problem, where a fully-observable robot policy generates supervision for a partially-observable one. We find that the quality of the supervision signal for the partially-observable pursuer policy depends on two key factors: the balance of diversity and optimality of the evader's behavior, and the strength of the modeling assumptions in the fully-observable policy. We deploy our policy on a physical quadruped robot with an RGB-D camera in pursuit-evasion interactions in the wild. Despite all the challenges, the sensing constraints bring about creativity: the robot is pushed to gather information when uncertain, predict intent from noisy measurements, and anticipate in order to intercept. Project webpage: https://abajcsy.github.io/vision-based-pursuit/

