

ReCamDriving: LiDAR-Free Camera-Controlled Novel Trajectory Video Generation

December 3, 2025
Authors: Yaokun Li, Shuaixian Wang, Mantang Guo, Jiehui Huang, Taojun Ding, Mu Hu, Kaixuan Wang, Shaojie Shen, Guang Tan
cs.AI

Abstract

We propose ReCamDriving, a purely vision-based, camera-controlled novel-trajectory video generation framework. While repair-based methods fail to restore complex artifacts and LiDAR-based approaches rely on sparse and incomplete cues, ReCamDriving leverages dense and scene-complete 3DGS renderings for explicit geometric guidance, achieving precise camera-controllable generation. To mitigate overfitting to restoration behaviors when conditioned on 3DGS renderings, ReCamDriving adopts a two-stage training paradigm: the first stage uses camera poses for coarse control, while the second stage incorporates 3DGS renderings for fine-grained viewpoint and geometric guidance. Furthermore, we present a 3DGS-based cross-trajectory data curation strategy to eliminate the train-test gap in camera transformation patterns, enabling scalable multi-trajectory supervision from monocular videos. Based on this strategy, we construct the ParaDrive dataset, containing over 110K parallel-trajectory video pairs. Extensive experiments demonstrate that ReCamDriving achieves state-of-the-art camera controllability and structural consistency.
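The abstract describes a two-stage training paradigm: stage one conditions only on camera poses for coarse control, and stage two additionally conditions on dense 3DGS renderings for fine-grained viewpoint and geometric guidance. A minimal sketch of that conditioning schedule is below; the function and argument names (`condition_inputs`, `gs_rendering`) are illustrative assumptions, not the authors' API, and the underlying video generation network is abstracted away entirely.

```python
def condition_inputs(stage, camera_pose, gs_rendering=None):
    """Select conditioning signals for each training stage (sketch).

    Stage 1: coarse control from camera poses only, so the model does not
             overfit to restoring 3DGS rendering artifacts.
    Stage 2: add the dense, scene-complete 3DGS rendering as an explicit
             geometric condition for fine-grained viewpoint control.
    """
    conditions = {"camera_pose": camera_pose}
    if stage == 2:
        if gs_rendering is None:
            raise ValueError("stage 2 requires a 3DGS rendering condition")
        conditions["gs_rendering"] = gs_rendering
    return conditions


# Hypothetical usage: the same pose is used in both stages; only stage 2
# sees the rendered geometry.
stage1 = condition_inputs(1, camera_pose="pose_t")
stage2 = condition_inputs(2, camera_pose="pose_t", gs_rendering="render_t")
```

The point of the split, per the abstract, is that conditioning on 3DGS renderings from the start biases the model toward restoration behavior; introducing them only after coarse pose control is learned mitigates that overfitting.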