

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

June 13, 2024
作者: Alexander Nikulin, Ilya Zisman, Alexey Zemtsov, Viacheslav Sinii, Vladislav Kurenkov, Sergey Kolesnikov
cs.AI

Abstract

Following the success of the in-context learning paradigm in large-scale language and computer vision models, the recently emerging field of in-context reinforcement learning is experiencing rapid growth. However, its development has been held back by the lack of challenging benchmarks, as all experiments so far have been carried out in simple environments and on small-scale datasets. We present XLand-100B, a large-scale dataset for in-context reinforcement learning based on the XLand-MiniGrid environment, as a first step toward alleviating this problem. It contains complete learning histories for nearly 30,000 different tasks, covering 100B transitions and 2.5B episodes. Collecting the dataset took 50,000 GPU hours, which is beyond the reach of most academic labs. Along with the dataset, we provide utilities to reproduce it or expand it even further. With this substantial effort, we aim to democratize research in the rapidly growing field of in-context reinforcement learning and provide a solid foundation for further scaling. The code is open-source and available under the Apache 2.0 license at https://github.com/dunno-lab/xland-minigrid-datasets.
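To make the abstract's scale figures concrete, the sketch below models one "learning history" as flat arrays of transitions and derives per-history statistics (transition count, episode count via `done` flags, mean episodic return). This is a minimal illustration of the transitions-vs-episodes distinction; the field names and layout are hypothetical and are not the dataset's actual on-disk schema.

```python
import numpy as np

def history_stats(rewards: np.ndarray, dones: np.ndarray) -> dict:
    """Summarize a single learning history stored as transition arrays.

    Each transition carries a reward and a `done` flag; every `done`
    flag terminates one episode, so episodes = sum(dones).
    """
    num_transitions = len(rewards)
    num_episodes = int(dones.sum())
    mean_return = float(rewards.sum()) / max(num_episodes, 1)
    return {
        "transitions": num_transitions,
        "episodes": num_episodes,
        "mean_return": mean_return,
    }

# Toy history: 6 transitions forming 2 episodes (done=1 ends an episode).
rewards = np.array([0.0, 1.0, 0.0, 0.0, 0.0, 1.0])
dones = np.array([0, 1, 0, 0, 0, 1])
print(history_stats(rewards, dones))
```

At XLand-100B scale, the same bookkeeping yields the paper's headline numbers: roughly 100B transitions partitioned into about 2.5B episodes across ~30,000 task histories, i.e. around 40 transitions per episode on average.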

