Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models

July 22, 2024
Authors: Xin Ma, Yaohui Wang, Gengyu Jia, Xinyuan Chen, Yuan-Fang Li, Cunjian Chen, Yu Qiao
cs.AI

Abstract

Diffusion models have made great progress in image animation thanks to their powerful generative capabilities. However, maintaining spatio-temporal consistency with the details of the input static image over time (e.g., its style, background, and objects) and ensuring smoothness in animated video narratives guided by textual prompts remain challenging. In this paper, we introduce Cinemo, a novel image animation approach that aims for better motion controllability as well as stronger temporal consistency and smoothness. We propose three effective strategies at the training and inference stages of Cinemo to accomplish this goal. At the training stage, Cinemo focuses on learning the distribution of motion residuals via a motion diffusion model, rather than directly predicting subsequent frames. Additionally, a strategy based on the structural similarity index (SSIM) is proposed to give Cinemo better control over motion intensity. At the inference stage, a noise refinement technique based on the discrete cosine transform (DCT) is introduced to mitigate sudden motion changes. These three strategies enable Cinemo to produce highly consistent, smooth, and motion-controllable results. Compared to previous methods, Cinemo offers simpler and more precise user controllability. Extensive experiments against several state-of-the-art methods, including both commercial tools and research approaches, across multiple metrics demonstrate the effectiveness and superiority of the proposed approach.
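
The abstract names three ingredients without giving formulas, so the following is only a rough NumPy sketch of what each could look like, not the authors' implementation. Every function name and parameter here is hypothetical. It assumes that residuals are taken relative to the first (input) frame, that motion intensity can be approximated by one minus a simplified single-window SSIM between consecutive frames, and that noise refinement mixes the low DCT frequencies of a reference signal into random noise.

```python
import numpy as np


def motion_residuals(frames: np.ndarray) -> np.ndarray:
    """Difference of every later frame from the first frame (assumption).

    frames has shape (T, H, W); the result has shape (T - 1, H, W).
    """
    return frames[1:] - frames[:1]


def global_ssim(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    """Simplified SSIM over the whole image (one window, no sliding)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)


def motion_intensity(frames: np.ndarray) -> float:
    """Hypothetical score: 1 - mean SSIM between consecutive frames."""
    scores = [global_ssim(a, b) for a, b in zip(frames[:-1], frames[1:])]
    return 1.0 - float(np.mean(scores))


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0] /= np.sqrt(2.0)  # scale the DC row so the matrix is orthonormal
    return m


def refine_noise(noise: np.ndarray, reference: np.ndarray, cutoff: int) -> np.ndarray:
    """Take the lowest `cutoff` x `cutoff` DCT coefficients from `reference`
    and all remaining coefficients from `noise` (assumed mixing rule)."""
    h, w = noise.shape
    dh, dw = dct_matrix(h), dct_matrix(w)
    noise_f = dh @ noise @ dw.T        # forward 2D DCT
    ref_f = dh @ reference @ dw.T
    mask = np.zeros((h, w), dtype=bool)
    mask[:cutoff, :cutoff] = True      # low-frequency corner
    mixed = np.where(mask, ref_f, noise_f)
    return dh.T @ mixed @ dw           # inverse 2D DCT
```

In this sketch a static clip yields zero residuals and an intensity near zero, while `cutoff` controls how much low-frequency structure from the reference survives in the starting noise. The actual conditioning and mixing used in Cinemo may differ.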
