

Explorative Inbetweening of Time and Space

March 21, 2024
Authors: Haiwen Feng, Zheng Ding, Zhihao Xia, Simon Niklaus, Victoria Abrevaya, Michael J. Black, Xuaner Zhang
cs.AI

Abstract

We introduce bounded generation as a generalized task to control video generation to synthesize arbitrary camera and subject motion based only on a given start and end frame. Our objective is to fully leverage the inherent generalization capability of an image-to-video model without additional training or fine-tuning of the original model. This is achieved through the proposed new sampling strategy, which we call Time Reversal Fusion, that fuses the temporally forward and backward denoising paths conditioned on the start and end frame, respectively. The fused path results in a video that smoothly connects the two frames, generating inbetweening of faithful subject motion, novel views of static scenes, and seamless video looping when the two bounding frames are identical. We curate a diverse evaluation dataset of image pairs and compare against the closest existing methods. We find that Time Reversal Fusion outperforms related work on all subtasks, exhibiting the ability to generate complex motions and 3D-consistent views guided by bounded frames. See project page at https://time-reversal.github.io.
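The abstract describes Time Reversal Fusion as fusing a temporally forward denoising path (conditioned on the start frame) with a temporally backward one (conditioned on the end frame). A minimal sketch of one such fusion step is below; the `denoise` callable, the conditioning interface, and the simple averaging rule are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def time_reversal_fusion_step(latents, t, denoise, start_frame, end_frame):
    """One hypothetical fusion step over a stack of per-frame latents.

    - Forward path: denoise the sequence conditioned on the start frame.
    - Backward path: reverse the sequence in time, denoise conditioned on
      the end frame, then reverse the result back.
    - Fuse the two paths (here, a plain average for illustration).
    """
    fwd = denoise(latents, t, cond=start_frame)            # forward in time
    bwd = denoise(latents[::-1], t, cond=end_frame)[::-1]  # backward, re-reversed
    return 0.5 * (fwd + bwd)                               # fused denoising path
```

Running this step at every denoising iteration would keep the generated sequence consistent with both bounding frames without retraining the underlying image-to-video model.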
