Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning
February 8, 2024
Authors: Mohak Bhardwaj, Thomas Lampe, Michael Neunert, Francesco Romano, Abbas Abdolmaleki, Arunkumar Byravan, Markus Wulfmeier, Martin Riedmiller, Jonas Buchli
cs.AI
Abstract
Recent advances in real-world applications of reinforcement learning (RL)
have relied on the ability to accurately simulate systems at scale. However,
domains such as fluid dynamical systems exhibit complex dynamic phenomena that
are hard to simulate at high integration rates, limiting the direct application
of modern deep RL algorithms to often expensive or safety-critical hardware. In
this work, we introduce "Box o Flows", a novel benchtop experimental control
system for systematically evaluating RL algorithms in dynamic real-world
scenarios. We describe the key components of the Box o Flows, and through a
series of experiments demonstrate how state-of-the-art model-free RL algorithms
can synthesize a variety of complex behaviors via simple reward specifications.
Furthermore, we explore the role of offline RL in data-efficient hypothesis
testing by reusing past experiences. We believe that the insights gained from
this preliminary study and the availability of systems like the Box o Flows
support the way forward for developing systematic RL algorithms that can be
generally applied to complex, dynamical systems. Supplementary material and
videos of experiments are available at
https://sites.google.com/view/box-o-flows/home.