Learning Vision-based Pursuit-Evasion Robot Policies

Abstract

Learning strategic robot behavior, like that required in pursuit-evasion interactions, under real-world constraints is extremely challenging. It requires exploiting the dynamics of the interaction and planning through both physical state and latent intent uncertainty. In this paper, we transform this intractable problem into a supervised learning problem, where a fully-observable robot policy generates supervision for a partially-observable one. We find that the quality of the supervision signal for the partially-observable pursuer policy depends on two key factors: the balance of diversity and optimality in the evader's behavior, and the strength of the modeling assumptions in the fully-observable policy. We deploy our policy on a physical quadruped robot with an RGB-D camera for pursuit-evasion interactions in the wild. Despite all the challenges, the sensing constraints bring about creativity: the robot is pushed to gather information when uncertain, predict intent from noisy measurements, and anticipate in order to intercept. Project webpage: https://abajcsy.github.io/vision-based-pursuit/
