TrackGo: A Flexible and Efficient Method for Controllable Video Generation

August 21, 2024
Authors: Haitao Zhou, Chuang Wang, Rui Nie, Jinxiao Lin, Dongdong Yu, Qian Yu, Changhu Wang
cs.AI

Abstract

Recent years have seen substantial progress in diffusion-based controllable video generation. However, achieving precise control in complex scenarios, including fine-grained object parts, sophisticated motion trajectories, and coherent background movement, remains a challenge. In this paper, we introduce TrackGo, a novel approach that leverages free-form masks and arrows for conditional video generation. This method offers users a flexible and precise mechanism for manipulating video content. We also propose the TrackAdapter for control implementation, an efficient and lightweight adapter designed to be seamlessly integrated into the temporal self-attention layers of a pretrained video generation model. This design builds on our observation that the attention maps of these layers can accurately activate regions corresponding to motion in videos. Our experimental results demonstrate that our new approach, enhanced by the TrackAdapter, achieves state-of-the-art performance on key metrics such as FVD, FID, and ObjMC. The project page of TrackGo can be found at: https://zhtjtcz.github.io/TrackGo-Page/