RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers

May 19, 2025
Authors: Ahmet Berke Gokmen, Yigit Ekin, Bahri Batuhan Bilecen, Aysegul Dundar
cs.AI

Abstract

We propose RoPECraft, a training-free video motion transfer method for diffusion transformers that operates solely by modifying their rotary positional embeddings (RoPE). We first extract dense optical flow from a reference video, and utilize the resulting motion offsets to warp the complex-exponential tensors of RoPE, effectively encoding motion into the generation process. These embeddings are then further optimized during denoising time steps via trajectory alignment between the predicted and target velocities using a flow-matching objective. To keep the output faithful to the text prompt and prevent duplicate generations, we incorporate a regularization term based on the phase components of the reference video's Fourier transform, projecting the phase angles onto a smooth manifold to suppress high-frequency artifacts. Experiments on benchmarks reveal that RoPECraft outperforms all recently published methods, both qualitatively and quantitatively.
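
To make the first step of the abstract concrete, here is a minimal NumPy sketch of warping RoPE's complex exponentials with optical-flow offsets. It is an illustration under stated assumptions, not the authors' implementation: the function names (`rope_frequencies`, `axis_rope`, `warped_rope`, `apply_rope`), the single-frame 2D token grid, the even split of channels between the x and y axes, the sign and scale of the flow, and the base of 10000 are all hypothetical choices. The temporal axis, the flow-matching optimization, and the Fourier-phase regularizer are omitted.

```python
import numpy as np


def rope_frequencies(axis_dim, base=10000.0):
    """Inverse frequencies for one axis; axis_dim (real channels) must be even."""
    return base ** (-np.arange(0, axis_dim, 2) / axis_dim)


def axis_rope(positions, axis_dim, base=10000.0):
    """Complex exponentials exp(i * position * frequency).

    positions may be fractional, which is what allows flow offsets to be
    added to the integer token grid.
    """
    angles = positions[..., None] * rope_frequencies(axis_dim, base)
    return np.exp(1j * angles)


def warped_rope(flow, axis_dim):
    """Build RoPE phases on a token grid displaced by optical flow.

    flow: (H, W, 2) array of (dx, dy) offsets in token units (assumed).
    Returns an (H, W, axis_dim) complex tensor: the first half of the last
    axis encodes x positions, the second half encodes y positions.
    """
    height, width, _ = flow.shape
    ys, xs = np.meshgrid(
        np.arange(height, dtype=float),
        np.arange(width, dtype=float),
        indexing="ij",
    )
    x_phases = axis_rope(xs + flow[..., 0], axis_dim)
    y_phases = axis_rope(ys + flow[..., 1], axis_dim)
    return np.concatenate([x_phases, y_phases], axis=-1)


def apply_rope(x, rope):
    """Rotate consecutive channel pairs of x by the complex phases in rope.

    x: real array (..., 2K); rope: complex array (..., K).
    """
    x_complex = x[..., 0::2] + 1j * x[..., 1::2]
    rotated = x_complex * rope
    return np.stack([rotated.real, rotated.imag], axis=-1).reshape(x.shape)
```

With zero flow, `warped_rope` reduces to ordinary axial RoPE on the integer grid. A uniform flow of one token along x gives each token the phases its right-hand neighbor would have had, which is the sense in which motion offsets are written into the positional encoding. Because each channel pair is multiplied by a unit-magnitude complex number, `apply_rope` leaves feature norms unchanged.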
