

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

November 7, 2024
作者: David Junhao Zhang, Roni Paiss, Shiran Zada, Nikhil Karnad, David E. Jacobs, Yael Pritch, Inbar Mosseri, Mike Zheng Shou, Neal Wadhwa, Nataniel Ruiz
cs.AI

Abstract

Recently, breakthroughs in video modeling have allowed for controllable camera trajectories in generated videos. However, these methods cannot be directly applied to user-provided videos that are not generated by a video model. In this paper, we present ReCapture, a method for generating new videos with novel camera trajectories from a single user-provided video. Our method allows us to re-generate the reference video, with all its existing scene motion, from vastly different angles and with cinematic camera motion. Notably, using our method we can also plausibly hallucinate parts of the scene that were not observable in the reference video. Our method works by (1) generating a noisy anchor video with a new camera trajectory using multiview diffusion models or depth-based point cloud rendering and then (2) regenerating the anchor video into a clean and temporally consistent reangled video using our proposed masked video fine-tuning technique.
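The two-step pipeline above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: `reproject_frame` only shows the geometric idea behind depth-based point cloud rendering (unproject pixels with a depth map, move them into a new camera, reproject), and `masked_mse` shows the masking idea behind masked video fine-tuning (supervise only the pixels the anchor rendering actually observed). The function names, the pinhole-intrinsics convention, and the simple MSE objective are all assumptions for illustration.

```python
import numpy as np

def reproject_frame(depth, K, T_rel):
    """Illustrative depth-based warp: map each source pixel to its
    location in a new camera view (the core of point cloud rendering).

    depth : (H, W) per-pixel depth map for the source frame
    K     : (3, 3) pinhole camera intrinsics (assumed shared by both views)
    T_rel : (4, 4) rigid transform from the source to the new camera frame
    Returns an (H, W, 2) array of (u, v) pixel coordinates in the new view.
    """
    H, W = depth.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Homogeneous pixel coordinates, shape (3, H*W)
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    # Unproject to 3D points in the source camera frame
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)
    pts_h = np.vstack([pts, np.ones((1, pts.shape[1]))])  # (4, H*W)
    # Transform into the new camera frame, then project with K
    pts_new = (T_rel @ pts_h)[:3]
    proj = K @ pts_new
    return (proj[:2] / proj[2:3]).T.reshape(H, W, 2)

def masked_mse(pred, target, mask):
    """Illustrative masked reconstruction loss: pixels the anchor video
    never observed (mask == 0) contribute nothing, so the model is free
    to hallucinate them while matching the observed regions."""
    denom = np.clip(np.sum(mask), 1e-8, None)
    return float(np.sum(mask * (pred - target) ** 2) / denom)
```

With an identity relative pose, `reproject_frame` maps every pixel back to itself, which is a quick sanity check of the geometry; a nontrivial `T_rel` produces the warped (and typically hole-filled, "noisy") anchor frame that step (2) then cleans up under the masked loss.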
PDF · November 13, 2024