

Explorative Inbetweening of Time and Space

March 21, 2024
Authors: Haiwen Feng, Zheng Ding, Zhihao Xia, Simon Niklaus, Victoria Abrevaya, Michael J. Black, Xuaner Zhang
cs.AI

Abstract
We introduce bounded generation as a generalized task to control video generation to synthesize arbitrary camera and subject motion based only on a given start and end frame. Our objective is to fully leverage the inherent generalization capability of an image-to-video model without additional training or fine-tuning of the original model. This is achieved through the proposed new sampling strategy, which we call Time Reversal Fusion, that fuses the temporally forward and backward denoising paths conditioned on the start and end frame, respectively. The fused path results in a video that smoothly connects the two frames, generating inbetweening of faithful subject motion, novel views of static scenes, and seamless video looping when the two bounding frames are identical. We curate a diverse evaluation dataset of image pairs and compare against the closest existing methods. We find that Time Reversal Fusion outperforms related work on all subtasks, exhibiting the ability to generate complex motions and 3D-consistent views guided by bounded frames. See project page at https://time-reversal.github.io.
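The abstract describes Time Reversal Fusion only at a high level: at each sampling step, a temporally forward denoising path conditioned on the start frame is fused with a temporally backward path conditioned on the end frame. The sketch below is a minimal illustration of that fusion idea, not the authors' implementation; the function name, the simple weighted average, and the array layout are all assumptions for clarity.

```python
import numpy as np

def fuse_denoising_paths(pred_fwd, pred_bwd, weight=0.5):
    """Illustrative fusion of two per-step noise predictions.

    pred_fwd: prediction from the forward path, conditioned on the
        start frame, shape (num_frames, ...).
    pred_bwd: prediction from the backward path, conditioned on the
        end frame, in reversed temporal order.

    The backward prediction is flipped along the frame axis so both
    arrays describe the same frame ordering, then the two are blended.
    """
    return weight * pred_fwd + (1.0 - weight) * np.flip(pred_bwd, axis=0)

# Toy example: 5 "frames", each a 1-D latent of length 3.
pred_fwd = np.random.rand(5, 3)   # forward path, anchored at the start frame
pred_bwd = np.random.rand(5, 3)   # backward path, anchored at the end frame
fused = fuse_denoising_paths(pred_fwd, pred_bwd)
```

In the actual method this fusion would run inside each diffusion sampling step of a pretrained image-to-video model (the paper requires no training or fine-tuning); the uniform per-frame weight used here is a placeholder for whatever schedule the authors employ.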

