The Pulse of Motion: Measuring Physical Frame Rate from Visual Dynamics
March 15, 2026
Authors: Xiangbo Gao, Mingyang Wu, Siyuan Yang, Jiongze Yu, Pardis Taghavi, Fangzhou Lin, Zhengzhong Tu
cs.AI
Abstract
While recent generative video models have achieved remarkable visual realism and are being explored as world models, true physical simulation requires mastering both space and time. Current models can produce visually smooth kinematics, yet they lack a reliable internal motion pulse to ground these motions in a consistent, real-world time scale. This temporal ambiguity stems from the common practice of indiscriminately training on videos with vastly different real-world speeds, forcing them into standardized frame rates. This leads to what we term chronometric hallucination: generated sequences exhibit ambiguous, unstable, and uncontrollable physical motion speeds. To address this, we propose Visual Chronometer, a predictor that recovers the Physical Frames Per Second (PhyFPS) directly from the visual dynamics of an input video. Trained via controlled temporal resampling, our method estimates the true temporal scale implied by the motion itself, bypassing unreliable metadata. To systematically quantify this issue, we establish two benchmarks, PhyFPS-Bench-Real and PhyFPS-Bench-Gen. Our evaluations reveal a harsh reality: state-of-the-art video generators suffer from severe PhyFPS misalignment and temporal instability. Finally, we demonstrate that applying PhyFPS corrections significantly improves the human-perceived naturalness of AI-generated videos. Our project page is https://xiangbogaobarry.github.io/Visual_Chronometer/.
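The abstract does not spell out how controlled temporal resampling or PhyFPS correction are implemented, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes a source clip captured at a known real-time frame rate, that a lower PhyFPS label can be produced by keeping every `source_fps / target_phyfps`-th frame, and that a correction amounts to re-timing playback so that the container frame rate equals the predicted PhyFPS. The function names `resample_indices`, `apparent_speed`, and `corrected_duration` are hypothetical and chosen for illustration.

```python
def resample_indices(num_frames: int, source_fps: float, target_phyfps: float) -> list[int]:
    """Choose source-frame indices so that each output frame advances
    1 / target_phyfps seconds of physical time (assumed labeling scheme)."""
    if num_frames < 0 or source_fps <= 0 or target_phyfps <= 0:
        raise ValueError("frame count must be non-negative and rates positive")
    if target_phyfps > source_fps:
        # Sampling above the capture rate would need frame interpolation,
        # which this sketch deliberately does not attempt.
        raise ValueError("target_phyfps must not exceed source_fps")
    step = source_fps / target_phyfps  # source frames advanced per output frame
    indices = []
    i = 0
    while int(i * step) < num_frames:
        indices.append(int(i * step))  # floor to the nearest earlier frame
        i += 1
    return indices


def apparent_speed(phyfps: float, playback_fps: float) -> float:
    """Speed of on-screen motion relative to real time.

    Each frame spans 1 / phyfps seconds of physical time but is displayed
    for 1 / playback_fps seconds, so the ratio is playback_fps / phyfps.
    """
    if phyfps <= 0 or playback_fps <= 0:
        raise ValueError("rates must be positive")
    return playback_fps / phyfps


def corrected_duration(num_frames: int, predicted_phyfps: float) -> float:
    """Playback duration in seconds once the clip is re-timed so that its
    frame rate equals the predicted PhyFPS (assumed form of correction)."""
    if predicted_phyfps <= 0:
        raise ValueError("predicted_phyfps must be positive")
    return num_frames / predicted_phyfps


# A 12-frame clip captured at 120 fps, relabeled to a PhyFPS of 30:
print(resample_indices(12, 120, 30))   # [0, 4, 8]
# A clip whose motion implies 60 PhyFPS, played back at 30 fps:
print(apparent_speed(60, 30))          # 0.5, i.e. half-speed slow motion
```

Under these assumptions, a predictor trained on clips from `resample_indices` sees the same scene at several known time scales, and `apparent_speed` gives a simple way to express how far a generated clip's motion drifts from real time at a given playback rate.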