Bigger, Better, Faster: Human-level Atari with human-level efficiency
May 30, 2023
Authors: Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro
cs.AI
Abstract
We introduce a value-based RL agent, which we call BBF, that achieves
super-human performance in the Atari 100K benchmark. BBF relies on scaling the
neural networks used for value estimation, as well as a number of other design
choices that enable this scaling in a sample-efficient manner. We conduct
extensive analyses of these design choices and provide insights for future
work. We end with a discussion about updating the goalposts for
sample-efficient RL research on the ALE. We make our code and data publicly
available at
https://github.com/google-research/google-research/tree/master/bigger_better_faster.