
From Virtual Games to Real-World Play

June 23, 2025
作者: Wenqiang Sun, Fangyun Wei, Jinjing Zhao, Xi Chen, Zilong Chen, Hongyang Zhang, Jun Zhang, Yan Lu
cs.AI

Abstract
We introduce RealPlay, a neural network-based real-world game engine that enables interactive video generation from user control signals. Unlike prior work focused on game-style visuals, RealPlay aims to produce photorealistic, temporally consistent video sequences that resemble real-world footage. It operates in an interactive loop: users observe a generated scene, issue a control command, and receive a short video chunk in response. To enable such realistic and responsive generation, we address key challenges including iterative chunk-wise prediction for low-latency feedback, temporal consistency across iterations, and accurate control response. RealPlay is trained on a combination of labeled game data and unlabeled real-world videos, without requiring real-world action annotations. Notably, we observe two forms of generalization: (1) control transfer: RealPlay effectively maps control signals from virtual to real-world scenarios; and (2) entity transfer: although training labels originate solely from a car racing game, RealPlay generalizes to control diverse real-world entities, including bicycles and pedestrians, beyond vehicles. The project page can be found at https://wenqsun.github.io/RealPlay/
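The interactive loop described above (observe scene, issue command, receive a video chunk, repeat) can be sketched in minimal pseudocode-style Python. This is a hypothetical illustration of chunk-wise prediction with a short conditioning context, not the authors' actual API; all names (`RealPlayLoop`, `Chunk`, `toy_predictor`, the context-window size) are assumptions for clarity.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A short sequence of generated frames returned per interaction step."""
    frames: list


class RealPlayLoop:
    """Iterative chunk-wise prediction: each step conditions on frames from
    the previous chunk, which is how temporal consistency across iterations
    could be maintained."""

    def __init__(self, predictor, context_window: int = 4):
        self.predictor = predictor          # maps (context, action) -> Chunk
        self.context_window = context_window
        self.context = []                   # frames carried between steps

    def step(self, action: str) -> Chunk:
        # Generate the next short chunk conditioned on the user's control
        # signal and the tail of the previously generated frames.
        chunk = self.predictor(self.context, action)
        self.context = chunk.frames[-self.context_window:]
        return chunk


def toy_predictor(context, action):
    """Stand-in for the neural engine: emits 4 labeled placeholder frames."""
    start = len(context)
    return Chunk(frames=[f"{action}-frame{start + i}" for i in range(4)])


loop = RealPlayLoop(toy_predictor)
c1 = loop.step("forward")  # user issues a command, receives a chunk
c2 = loop.step("left")     # next chunk conditions on the previous one
```

In a real system the predictor would be the trained video model and the chunk length would trade off latency against temporal coherence; the loop structure itself is what gives the low-latency, responsive feel described in the abstract.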