

NitroGen: An Open Foundation Model for Generalist Gaming Agents

January 4, 2026
Authors: Loïc Magne, Anas Awadalla, Guanzhi Wang, Yinzhen Xu, Joshua Belofsky, Fengyuan Hu, Joohwan Kim, Ludwig Schmidt, Georgia Gkioxari, Jan Kautz, Yisong Yue, Yejin Choi, Yuke Zhu, Linxi "Jim" Fan
cs.AI

Abstract

We introduce NitroGen, a vision-action foundation model for generalist gaming agents that is trained on 40,000 hours of gameplay videos across more than 1,000 games. We incorporate three key ingredients: 1) an internet-scale video-action dataset constructed by automatically extracting player actions from publicly available gameplay videos, 2) a multi-game benchmark environment that can measure cross-game generalization, and 3) a unified vision-action model trained with large-scale behavior cloning. NitroGen exhibits strong competence across diverse domains, including combat encounters in 3D action games, high-precision control in 2D platformers, and exploration in procedurally generated worlds. It transfers effectively to unseen games, achieving up to 52% relative improvement in task success rates over models trained from scratch. We release the dataset, evaluation suite, and model weights to advance research on generalist embodied agents.
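The abstract's core training recipe is large-scale behavior cloning: the policy maps video frames to the player actions extracted from gameplay footage, supervised with a standard cross-entropy objective. The sketch below is a hypothetical, heavily simplified illustration of that idea (NumPy only, a linear policy over pre-encoded frame features, a synthetic batch); the paper's actual architecture, action space, and scale are not described here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 64-dim encoded frames and 16 discrete actions.
num_actions, feat_dim, batch = 16, 64, 32
W = np.zeros((feat_dim, num_actions))            # linear policy weights
frames = rng.normal(size=(batch, feat_dim))      # stand-in for encoded video frames
actions = rng.integers(0, num_actions, batch)    # stand-in for extracted player actions

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Behavior cloning: gradient descent on the cross-entropy between the
# policy's action distribution and the demonstrated actions.
for step in range(200):
    probs = softmax(frames @ W)
    loss = -np.log(probs[np.arange(batch), actions]).mean()
    grad = probs.copy()
    grad[np.arange(batch), actions] -= 1.0       # d(loss)/d(logits)
    W -= 0.5 * (frames.T @ grad) / batch

accuracy = (softmax(frames @ W).argmax(axis=1) == actions).mean()
```

After training, the policy reproduces the demonstrated actions on this toy batch; the real model replaces the linear map with a vision backbone and trains on 40,000 hours of video-action pairs.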