AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning

February 1, 2024
Authors: Fu-Yun Wang, Zhaoyang Huang, Xiaoyu Shi, Weikang Bian, Guanglu Song, Yu Liu, Hongsheng Li
cs.AI

Abstract

Video diffusion models have been gaining increasing attention for their ability to produce videos that are both coherent and of high fidelity. However, the iterative denoising process makes them computationally intensive and time-consuming, limiting their applications. Inspired by the Consistency Model (CM), which distills pretrained image diffusion models to accelerate sampling with minimal steps, and its successful extension, the Latent Consistency Model (LCM), for conditional image generation, we propose AnimateLCM, enabling high-fidelity video generation within minimal steps. Instead of conducting consistency learning directly on the raw video dataset, we propose a decoupled consistency learning strategy that separates the distillation of image generation priors from that of motion generation priors, which improves training efficiency and enhances the visual quality of the generated videos. Additionally, to enable the combination of plug-and-play adapters from the Stable Diffusion community for various functions (e.g., ControlNet for controllable generation), we propose an efficient strategy to adapt existing adapters to our distilled text-conditioned video consistency model, or to train adapters from scratch, without harming the sampling speed. We validate the proposed strategies on image-conditioned video generation and layout-conditioned video generation, achieving top-performing results in both. Experimental results validate the effectiveness of our proposed method. Code and weights will be made public. More details are available at https://github.com/G-U-N/AnimateLCM.
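To make the training recipe concrete, below is a minimal PyTorch sketch of the CM/LCM-style consistency-distillation loop that the decoupled strategy builds on. The tiny denoiser, toy noise schedule, x0-parameterization, and skipping-step interval are illustrative assumptions, not the paper's implementation (it also omits CM's boundary-condition skip parameterization for brevity); see the repository above for the actual code.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-in for the real network; AnimateLCM distills a Stable Diffusion
# UNet plus temporal motion layers, not a single conv.
class TinyDenoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 3, 3, padding=1)

    def forward(self, x, t):
        # Predicts the clean sample from a noisy one (x0-parameterization,
        # an assumption made to keep the sketch short; t is ignored here).
        return self.net(x)

def ddim_step(model, x_t1, t1, t0, alphas):
    """One deterministic solver step t1 -> t0 along the teacher's ODE."""
    a1, a0 = alphas[t1], alphas[t0]
    x0_pred = model(x_t1, t1)
    eps = (x_t1 - a1.sqrt() * x0_pred) / (1 - a1).sqrt()
    return a0.sqrt() * x0_pred + (1 - a0).sqrt() * eps

T = 1000
alphas = torch.linspace(0.9999, 0.01, T)  # toy noise schedule
teacher = TinyDenoiser()                  # frozen pretrained diffusion model
student = TinyDenoiser()                  # consistency model being distilled
ema = copy.deepcopy(student)              # EMA "target" network
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

for _ in range(10):                       # toy training loop
    x0 = torch.randn(2, 3, 32, 32)        # stand-in for (latent) training data
    t1 = torch.randint(1, T, (1,)).item()
    t0 = max(t1 - 20, 0)                  # skipping-step interval, as in LCM
    noise = torch.randn_like(x0)
    x_t1 = alphas[t1].sqrt() * x0 + (1 - alphas[t1]).sqrt() * noise

    with torch.no_grad():
        x_t0 = ddim_step(teacher, x_t1, t1, t0, alphas)  # teacher ODE step
        target = ema(x_t0, t0)                           # target prediction
    loss = F.mse_loss(student(x_t1, t1), target)         # consistency loss

    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                                # EMA update of target
        for p_ema, p in zip(ema.parameters(), student.parameters()):
            p_ema.mul_(0.95).add_(p, alpha=0.05)
```

Under the decoupled strategy described in the abstract, a loop of this shape would first run on image data with only the spatial (image) weights, distilling the image generation prior, and then run on video clips with the temporal motion layers inserted, so the motion prior is distilled separately rather than learning both from raw video at once.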