DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation

November 28, 2025
Authors: Hongfei Zhang, Kanghao Chen, Zixin Zhang, Harold Haodong Chen, Yuanhuiyi Lyu, Yuqi Zhang, Shuai Yang, Kun Zhou, Yingcong Chen
cs.AI

Abstract

This paper presents DualCamCtrl, a novel end-to-end diffusion model for camera-controlled video generation. Recent works have advanced this field by representing camera poses as ray-based conditions, yet they often lack sufficient scene understanding and geometric awareness. DualCamCtrl targets this limitation with a dual-branch framework that jointly generates camera-consistent RGB and depth sequences. To harmonize the two modalities, we further propose the Semantic Guided Mutual Alignment (SIGMA) mechanism, which performs RGB-depth fusion in a semantics-guided and mutually reinforcing manner. Together, these designs enable DualCamCtrl to better disentangle appearance and geometry modeling and to generate videos that adhere more faithfully to the specified camera trajectories. Additionally, we analyze the distinct influence of depth and camera poses across denoising stages, and show that early and late stages play complementary roles: early stages form global structure, while late stages refine local details. Extensive experiments demonstrate that DualCamCtrl achieves more consistent camera-controlled video generation, reducing camera motion errors by over 40% compared with prior methods. Our project page: https://soyouthinkyoucantell.github.io/dualcamctrl-page/
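
As a rough illustration of what a ray-based camera condition can look like, the sketch below computes a Plücker ray embedding for a single pixel. This is an assumption made for illustration only: the abstract does not say which ray parameterization DualCamCtrl uses, and the function name `plucker_ray`, its parameters, and the pinhole camera model are hypothetical choices, not the authors' code.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def plucker_ray(u, v, fx, fy, cx, cy, R_c2w, origin):
    """Return a 6-D Plücker embedding (moment, direction) for pixel (u, v).

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy). R_c2w is a 3x3 camera-to-world rotation (nested lists)
    and origin is the camera center in world coordinates. All of these
    names are illustrative, not taken from the paper.
    """
    # Back-project the pixel to a direction in camera space.
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Rotate the direction into world space.
    d_world = tuple(
        sum(R_c2w[i][j] * d_cam[j] for j in range(3)) for i in range(3)
    )
    # Normalize to unit length.
    norm = math.sqrt(sum(c * c for c in d_world))
    d = tuple(c / norm for c in d_world)
    # The moment vector is the camera center crossed with the direction.
    m = cross(origin, d)
    return m + d


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ray = plucker_ray(320, 240, 500.0, 500.0, 320.0, 240.0, identity, (1.0, 0.0, 0.0))
print(ray)
```

Evaluating such an embedding at every pixel of every frame gives a dense, per-pixel map of the camera pose that a video model can take as conditioning. By construction, the moment part of each ray is orthogonal to its direction part.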
December 4, 2025