TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
March 7, 2025
Authors: Mark YU, Wenbo Hu, Jinbo Xing, Ying Shan
cs.AI
Abstract
We present TrajectoryCrafter, a novel approach to redirect camera
trajectories for monocular videos. By disentangling deterministic view
transformations from stochastic content generation, our method achieves precise
control over user-specified camera trajectories. We propose a novel dual-stream
conditional video diffusion model that concurrently integrates point cloud
renders and source videos as conditions, ensuring accurate view transformations
and coherent 4D content generation. Instead of leveraging scarce multi-view
videos, we curate a hybrid training dataset combining web-scale monocular
videos with static multi-view datasets via our innovative double-reprojection
strategy, significantly fostering robust generalization across diverse scenes.
Extensive evaluations on multi-view and large-scale monocular videos
demonstrate the superior performance of our method.
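The "point cloud renders" used as one conditioning stream come from a deterministic view transformation: each source pixel is back-projected with an estimated depth map, transformed into the user-specified target camera, and splatted back onto the image plane. The paper does not publish this routine here; below is a minimal sketch of such a depth-based reprojection in NumPy, with a hypothetical function name and a simple nearest-point z-buffer, assuming a pinhole intrinsic matrix `K` and a 4x4 source-to-target extrinsic `T_src_to_tgt`.

```python
import numpy as np

def render_point_cloud(image, depth, K, T_src_to_tgt):
    """Sketch: reproject a source frame into a target camera view.

    image: (H, W, C) source colors; depth: (H, W) per-pixel depth;
    K: (3, 3) pinhole intrinsics; T_src_to_tgt: (4, 4) rigid transform.
    Returns an (H, W, C) render with holes (zeros) where nothing projects.
    """
    H, W = depth.shape
    # Back-project every pixel to a 3D point in the source camera frame.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # (3, HW)
    pts_src = np.linalg.inv(K) @ pix * depth.reshape(1, -1)            # (3, HW)
    # Move the point cloud into the target camera frame.
    pts_h = np.vstack([pts_src, np.ones((1, pts_src.shape[1]))])
    pts_tgt = (T_src_to_tgt @ pts_h)[:3]
    # Project into the target image plane.
    proj = K @ pts_tgt
    z = proj[2]
    uu = np.round(proj[0] / z).astype(int)
    vv = np.round(proj[1] / z).astype(int)
    valid = (z > 0) & (uu >= 0) & (uu < W) & (vv >= 0) & (vv < H)
    # Crude z-buffer: draw far-to-near so the nearest point wins each pixel.
    order = np.argsort(-z[valid])
    uu, vv = uu[valid][order], vv[valid][order]
    colors = image.reshape(-1, image.shape[-1])[valid][order]
    out = np.zeros_like(image)
    out[vv, uu] = colors
    return out
```

The unfilled regions (disocclusions) left by such a render are exactly what the stochastic generation stream of the dual-stream diffusion model must hallucinate, which is why the source video is supplied as a second condition.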