Bigger, Better, Faster: Human-level Atari with human-level efficiency

May 30, 2023
Authors: Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro
cs.AI

Abstract

We introduce a value-based RL agent, which we call BBF, that achieves super-human performance in the Atari 100K benchmark. BBF relies on scaling the neural networks used for value estimation, as well as a number of other design choices that enable this scaling in a sample-efficient manner. We conduct extensive analyses of these design choices and provide insights for future work. We end with a discussion about updating the goalposts for sample-efficient RL research on the ALE. We make our code and data publicly available at https://github.com/google-research/google-research/tree/master/bigger_better_faster.
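The abstract does not say how "super-human" is measured. Assuming the usual Atari convention of a human-normalized score, where 0 corresponds to a random policy and 1 to the human reference, a minimal sketch of the calculation might look like the following. The function names, the game names, and every number here are hypothetical placeholders for illustration, not results from the paper, and the plain mean is only one possible way to aggregate across games.

```python
from typing import Mapping


def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Return (agent - random) / (human - random).

    0.0 matches a random policy, 1.0 matches the human reference,
    and values above 1.0 are "super-human" under this convention.
    """
    if human == random:
        raise ValueError("human and random reference scores must differ")
    return (agent - random) / (human - random)


def mean_human_normalized_score(
    agent: Mapping[str, float],
    random: Mapping[str, float],
    human: Mapping[str, float],
) -> float:
    """Average the per-game normalized scores (plain mean, for illustration)."""
    per_game = [
        human_normalized_score(agent[game], random[game], human[game])
        for game in agent
    ]
    return sum(per_game) / len(per_game)


# Hypothetical placeholder scores, NOT taken from the paper.
agent_scores = {"GameA": 150.0, "GameB": 30.0}
random_scores = {"GameA": 50.0, "GameB": 10.0}
human_scores = {"GameA": 250.0, "GameB": 20.0}

aggregate = mean_human_normalized_score(agent_scores, random_scores, human_scores)
```

Under this convention, an aggregate above 1.0 would indicate performance beyond the human reference across the evaluated games.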