DynaGuide: Steering Diffusion Policies with Active Dynamic Guidance

Abstract

Deploying large, complex policies in the real world requires the ability to steer them to fit the needs of a situation. Most common steering approaches, like goal-conditioning, require training the robot policy with a distribution of test-time objectives in mind. To overcome this limitation, we present DynaGuide, a steering method for diffusion policies using guidance from an external dynamics model during the diffusion denoising process. DynaGuide separates the dynamics model from the base policy, which gives it multiple advantages, including the ability to steer towards multiple objectives, enhance underrepresented base policy behaviors, and maintain robustness on low-quality objectives. The separate guidance signal also allows DynaGuide to work with off-the-shelf pretrained diffusion policies. We demonstrate the performance and features of DynaGuide against other steering approaches in a series of simulated and real experiments, showing an average steering success of 70% on a set of articulated CALVIN tasks and outperforming goal-conditioning by 5.4x when steered with low-quality objectives. We also successfully steer an off-the-shelf real robot policy to express preference for particular objects and even create novel behavior. Videos and more can be found on the project website: https://dynaguide.github.io
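
As a rough illustration of what guiding a denoising process with a separate dynamics model can look like, the sketch below adds the gradient of a goal-reaching objective to a toy base policy's score at every denoising step. It is a minimal one-dimensional sketch under stated assumptions, not the DynaGuide implementation: the two-mode `base_score`, the additive `dynamics_model`, the squared-error objective, the Langevin-style update, and all constants are hypothetical choices made here for illustration.

```python
import math
import random

# Toy stand-in for a pretrained multimodal policy (assumption): actions cluster at -1 and +1.
BASE_VAR = 0.09  # variance of each action mode, chosen for illustration


def base_score(action):
    """Gradient of log p(action) for an equal-weight two-Gaussian mixture at -1 and +1."""
    return (math.tanh(action / BASE_VAR) - action) / BASE_VAR


def dynamics_model(state, action):
    """Hypothetical external dynamics model: predicts the next state from state and action."""
    return state + action


def guidance_grad(state, action, goal):
    """Gradient w.r.t. action of -0.5 * (predicted_next_state - goal) ** 2."""
    # d(next_state)/d(action) = 1 for this toy model, so the chain rule is trivial.
    return -(dynamics_model(state, action) - goal)


def sample_action(state, goal, guidance_scale, rng, steps=200, step_size=0.01):
    """Langevin-style denoising loop; the base policy itself is never modified."""
    action = rng.gauss(0.0, 1.0)  # start from pure noise
    for _ in range(steps):
        score = base_score(action)
        if guidance_scale > 0.0:
            # The separate guidance signal is simply added to the base score.
            score += guidance_scale * guidance_grad(state, action, goal)
        action += step_size * score + math.sqrt(2.0 * step_size) * rng.gauss(0.0, 1.0)
    return action


def fraction_positive(guidance_scale, n=200, seed=0):
    """Fraction of sampled actions that land in the +1 mode (goal is next_state = 1)."""
    rng = random.Random(seed)
    samples = [sample_action(0.0, 1.0, guidance_scale, rng) for _ in range(n)]
    return sum(a > 0.0 for a in samples) / n
```

In this toy setting, `fraction_positive(0.0)` samples both modes roughly evenly, while `fraction_positive(10.0)` concentrates almost all samples on the mode that the dynamics model predicts will reach the goal. The base policy is untouched in both cases, which is the property that lets a guidance signal of this kind be attached to an off-the-shelf policy.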
