SCOPE: Simulating Cross-game Operations in Playable Environments for FPS World Models

May 22, 2026
Authors: Zizhao Tong, Hongfeng Lai, Zeqing Wang, Zhaohu Xing, Kexu Cheng, Haoran Xu, Zhao Pu, Shangwen Zhu, Ruili Feng, Jian Zhao, Yan Zhang, Hao Tang, Yeying Jin, Ling Shao
cs.AI

Abstract

Interactive world models for first-person shooter (FPS) games must resolve high-frequency overlapping control signals at every frame without disrupting unaffected regions. Existing methods inject actions globally and train on single titles, failing under dense FPS inputs. We observe that FPS actions are spatially selective: discrete events such as firing or reloading affect only a localized region around the weapon (the scope), while continuous camera and movement signals govern stable surroundings. We propose SCOPE, which inserts a conditioning module into each transformer block of a pretrained video diffusion model. It reshapes features into per-pixel temporal sequences so that each position computes its action response from local visual content. This separates in-scope effects from out-of-scope generation without segmentation labels. We also introduce CrossFPS, the first multi-game FPS dataset with frame-aligned action telemetry. It comprises 69K clips from 7 titles with 10-DoF controller signals, curated to remove gameplay bias. The model learns general visual-to-action mappings rather than game-specific patterns, enabling zero-shot transfer to unseen scenes. Experiments confirm strong action responsiveness, precise scope separation, and effective cross-game generalization.
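
The abstract does not specify how the conditioning module computes its per-position action response, so the following is only a minimal NumPy sketch of the reshaping idea, not the authors' implementation. It assumes a cross-attention form in which each pixel's temporal sequence queries the per-frame action vectors; the function name `per_pixel_action_conditioning`, the projection matrices `w_q`, `w_k`, `w_v`, and the residual update are all hypothetical choices made for illustration.

```python
import numpy as np

def per_pixel_action_conditioning(features, actions, w_q, w_k, w_v):
    """Illustrative sketch only (assumed cross-attention form).

    features: (T, H, W, C) video features for T frames.
    actions:  (T, A) per-frame action vectors (e.g. A = 10 control signals).
    w_q: (C, D), w_k: (A, D), w_v: (A, C) hypothetical projection matrices.
    """
    T, H, W, C = features.shape
    # Reshape to per-pixel temporal sequences: one length-T sequence per position.
    seq = features.transpose(1, 2, 0, 3).reshape(H * W, T, C)
    q = seq @ w_q                      # (H*W, T, D): queries from local visual content
    k = actions @ w_k                  # (T, D): keys from the action stream
    v = actions @ w_v                  # (T, C): values from the action stream
    scores = q @ k.T / np.sqrt(k.shape[-1])        # (H*W, T, T)
    scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    response = weights @ v             # (H*W, T, C): per-position action response
    out = seq + response               # residual update (an assumption)
    # Restore the original (T, H, W, C) layout.
    return out.reshape(H, W, T, C).transpose(2, 0, 1, 3)
```

Because the queries come from each position's own features and no step mixes spatial positions, the response at one pixel cannot depend on any other pixel in this sketch. That is one plausible way a module could treat in-scope and out-of-scope regions differently without segmentation labels.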