Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning

Abstract

Recent advances in real-world applications of reinforcement learning (RL) have relied on the ability to accurately simulate systems at scale. However, domains such as fluid dynamical systems exhibit complex dynamic phenomena that are hard to simulate at high integration rates, which limits the direct application of modern deep RL algorithms to hardware that is often expensive or safety-critical. In this work, we introduce "Box o Flows", a novel benchtop experimental control system for systematically evaluating RL algorithms in dynamic real-world scenarios. We describe the key components of the Box o Flows and, through a series of experiments, demonstrate how state-of-the-art model-free RL algorithms can synthesize a variety of complex behaviors via simple reward specifications. Furthermore, we explore the role of offline RL in data-efficient hypothesis testing by reusing past experiences. We believe that the insights gained from this preliminary study, together with the availability of systems like the Box o Flows, support the way forward for developing systematic RL algorithms that can be generally applied to complex dynamical systems. Supplementary material and videos of experiments are available at https://sites.google.com/view/box-o-flows/home.
