MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
February 4, 2025
Authors: Ziyan Guo, Zeyu Hu, Na Zhao, De Wen Soh
cs.AI
Abstract
Human motion generation and editing are key components of computer graphics
and vision. However, current approaches in this field tend to offer isolated
solutions tailored to specific tasks, which can be inefficient and impractical
for real-world applications. While some efforts have aimed to unify
motion-related tasks, these methods simply use different modalities as
conditions to guide motion generation. Consequently, they lack editing
capabilities, fine-grained control, and fail to facilitate knowledge sharing
across tasks. To address these limitations and provide a versatile, unified
framework capable of handling both human motion generation and editing, we
introduce a novel paradigm: Motion-Condition-Motion, which enables the unified
formulation of diverse tasks with three concepts: source motion, condition, and
target motion. Based on this paradigm, we propose a unified framework,
MotionLab, which incorporates rectified flows to learn the mapping from source
motion to target motion, guided by the specified conditions. In MotionLab, we
introduce: 1) the MotionFlow Transformer to enhance conditional generation and
editing without task-specific modules; 2) Aligned Rotational Position Encoding
to guarantee the time synchronization between source motion and target motion;
3) Task Specified Instruction Modulation; and 4) Motion Curriculum Learning for
effective multi-task learning and knowledge sharing across tasks. Notably, our
MotionLab demonstrates promising generalization capabilities and inference
efficiency across multiple benchmarks for human motion. Our code and additional
video results are available at: https://diouo.github.io/motionlab.github.io/.
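
To make the abstract's rectified-flow formulation concrete, the sketch below shows, under stated assumptions, what one Motion-Condition-Motion training step could look like in PyTorch. Everything here is illustrative: VelocityNet is a toy stand-in for the paper's MotionFlow Transformer, and the tensor shapes, conditioning interface, and function name rectified_flow_loss are assumptions rather than MotionLab's actual implementation.

# Minimal rectified-flow sketch for the Motion-Condition-Motion paradigm.
# All names, shapes, and the conditioning interface are illustrative
# assumptions, not MotionLab's actual implementation.
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    """Toy stand-in for the MotionFlow Transformer: predicts the velocity
    that transports the source motion toward the target motion under a
    given condition."""
    def __init__(self, motion_dim: int, cond_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(motion_dim + cond_dim + 1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, motion_dim),
        )

    def forward(self, x_t, cond, t):
        # Broadcast the scalar flow time t as one extra feature per frame.
        t_feat = t.expand(x_t.shape[:-1] + (1,))
        return self.net(torch.cat([x_t, cond, t_feat], dim=-1))

def rectified_flow_loss(model, source, target, cond):
    """One training step: sample t ~ U(0, 1), interpolate linearly between
    source and target, and regress the constant velocity (target - source)."""
    b = source.shape[0]
    t = torch.rand(b, 1, 1, device=source.device)   # one flow time per sample
    x_t = (1.0 - t) * source + t * target           # straight-line interpolation
    v_target = target - source                      # rectified-flow velocity
    v_pred = model(x_t, cond, t)
    return ((v_pred - v_target) ** 2).mean()

# Usage: batch of 4 motions, 60 frames, 63-D pose features, 128-D condition.
model = VelocityNet(motion_dim=63, cond_dim=128)
source = torch.randn(4, 60, 63)   # source motion (e.g., noise for pure generation)
target = torch.randn(4, 60, 63)   # target motion
cond = torch.randn(4, 60, 128)    # e.g., an encoded text or trajectory condition
loss = rectified_flow_loss(model, source, target, cond)
loss.backward()

At inference, one would integrate the learned velocity field from the source motion toward the target, e.g. with a few Euler steps x <- x + (1/N) * v(x, c, t); the near-straight transport paths that rectified flow learns are what permit few-step sampling and the inference efficiency the abstract highlights.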