FluidWorld: Reaction-Diffusion Dynamics as a Predictive Substrate for World Models

March 22, 2026
Author: Fabien Polly
cs.AI

Abstract

World models learn to predict future states of an environment, enabling planning and mental simulation. Current approaches default to Transformer-based predictors operating in learned latent spaces. This comes at a cost: O(N^2) computation and no explicit spatial inductive bias. This paper asks a foundational question: is self-attention necessary for predictive world modeling, or can alternative computational substrates achieve comparable or superior results? I introduce FluidWorld, a proof-of-concept world model whose predictive dynamics are governed by partial differential equations (PDEs) of reaction-diffusion type. Instead of using a separate neural network predictor, the PDE integration itself produces the future state prediction. In a strictly parameter-matched three-way ablation on unconditional UCF-101 video prediction (64x64, ~800K parameters, identical encoder, decoder, losses, and data), FluidWorld is compared against both a Transformer baseline (self-attention) and a ConvLSTM baseline (convolutional recurrence). While all three models converge to comparable single-step prediction loss, FluidWorld achieves 2x lower reconstruction error, produces representations with 10-15% higher spatial structure preservation and 18-25% more effective dimensionality, and critically maintains coherent multi-step rollouts where both baselines degrade rapidly. All experiments were conducted on a single consumer-grade PC (Intel Core i5, NVIDIA RTX 4070 Ti), without any large-scale compute. These results establish that PDE-based dynamics, which natively provide O(N) spatial complexity, adaptive computation, and global spatial coherence through diffusion, are a viable and parameter-efficient alternative to both attention and convolutional recurrence for world modeling.
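
To make concrete the idea of a prediction produced by PDE integration rather than by a separate predictor network, here is a minimal sketch of an explicit-Euler rollout for a generic reaction-diffusion equation of the form du/dt = D * laplacian(u) + R(u). It is not the FluidWorld implementation: the 5-point stencil, the periodic boundaries, the cubic reaction term `u - u**3`, the step size, and all function names are assumptions chosen for illustration, and the model described in the abstract presumably uses learned terms on an encoded latent state.

```python
import numpy as np


def laplacian(u):
    """5-point discrete Laplacian, grid spacing 1, periodic boundaries (assumed)."""
    return (
        np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
        + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1)
        - 4.0 * u
    )


def reaction(u):
    """Hypothetical pointwise reaction term; a stand-in for a learned one."""
    return u - u**3


def rollout(u0, steps, dt=0.1, diffusion=1.0, reaction_fn=reaction):
    """Integrate du/dt = diffusion * laplacian(u) + reaction_fn(u) with explicit Euler.

    Each step touches every grid cell a constant number of times, so the
    per-step cost is linear in the number of cells.
    Explicit Euler on this stencil needs dt * diffusion <= 0.25 for stability.
    """
    u = u0.copy()
    for _ in range(steps):
        u = u + dt * (diffusion * laplacian(u) + reaction_fn(u))
    return u


# Usage: treat a random 16x16 field as the current state and integrate forward.
rng = np.random.default_rng(0)
state = rng.uniform(-1.0, 1.0, size=(16, 16))
future = rollout(state, steps=10)
```

In this sketch the diffusion term spreads information between neighboring cells at every step, which is one way to read the abstract's claim of global spatial coherence through diffusion, while the reaction term acts locally at each cell.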