DreamWorld: Unified World Modeling in Video Generation
February 28, 2026
Authors: Boming Tan, Xiangdong Zhang, Ning Liao, Yuqing Zhang, Shaofeng Zhang, Xue Yang, Qi Fan, Yanyong Zhang
cs.AI
Abstract
Despite impressive progress in video generation, existing models remain limited to surface-level plausibility and lack a coherent, unified understanding of the world. Prior approaches typically incorporate only a single form of world-related knowledge, or rely on rigid alignment strategies to introduce additional knowledge. However, aligning to a single form of world knowledge is insufficient for a world model, which must jointly model multiple heterogeneous dimensions (e.g., physical commonsense, 3D consistency, and temporal consistency). To address this limitation, we introduce DreamWorld, a unified framework that integrates complementary world knowledge into video generators via a Joint World Modeling Paradigm: the model jointly predicts video pixels and features from foundation models to capture temporal dynamics, spatial geometry, and semantic consistency. Naively optimizing these heterogeneous objectives, however, can lead to visual instability and temporal flickering. To mitigate this, we propose Consistent Constraint Annealing (CCA), which progressively regulates world-level constraints during training, and Multi-Source Inner-Guidance, which enforces the learned world priors at inference. Extensive evaluations show that DreamWorld improves world consistency, outperforming Wan2.1 by 2.26 points on VBench. Code will be made publicly available on GitHub: https://github.com/ABU121111/DreamWorld
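
As a rough illustration of the idea of combining a pixel objective with several feature-prediction objectives under a weight that changes over training, the sketch below may help. It is not the authors' CCA: the abstract does not give the schedule, its direction, or the loss form, so the cosine decay, the names `annealed_weight` and `joint_loss`, and the three example loss keys are all assumptions made for illustration only.

```python
import math


def annealed_weight(step, total_steps, w_start=1.0, w_end=0.0):
    """Cosine-interpolate a constraint weight from w_start to w_end.

    Assumption: a cosine decay is used here purely as an example schedule;
    the abstract does not specify how CCA regulates the constraints.
    """
    # Clamp training progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return w_end + 0.5 * (w_start - w_end) * (1.0 + math.cos(math.pi * progress))


def joint_loss(pixel_loss, feature_losses, step, total_steps):
    """Pixel loss plus an annealed sum of feature-prediction losses.

    feature_losses is a dict of scalar losses, one per hypothetical
    world-knowledge source (e.g. temporal, spatial, semantic).
    """
    w = annealed_weight(step, total_steps)
    return pixel_loss + w * sum(feature_losses.values())


if __name__ == "__main__":
    # Hypothetical loss values, chosen only to show how the weighting evolves.
    losses = {"temporal": 0.5, "spatial": 0.5, "semantic": 1.0}
    for step in (0, 50, 100):
        print(step, joint_loss(1.0, losses, step, total_steps=100))
```

Under this assumed schedule the feature terms dominate early in training and fade out by the end, leaving only the pixel objective; a real implementation could just as well ramp the weight up or use a per-source schedule.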