MotionCLR: Motion Generation and Training-free Editing via Understanding Attention Mechanisms
October 24, 2024
Authors: Ling-Hao Chen, Wenxun Dai, Xuan Ju, Shunlin Lu, Lei Zhang
cs.AI
Abstract
This research delves into the problem of interactive editing of human motion
generation. Previous motion diffusion models lack explicit modeling of the
word-level text-motion correspondence and good explainability, hence
restricting their fine-grained editing ability. To address this issue, we
propose an attention-based motion diffusion model, namely MotionCLR, with CLeaR
modeling of attention mechanisms. Technically, MotionCLR models the in-modality
and cross-modality interactions with self-attention and cross-attention,
respectively. More specifically, the self-attention mechanism aims to measure
the sequential similarity between frames and impacts the order of motion
features. By contrast, the cross-attention mechanism works to find the
fine-grained word-sequence correspondence and activate the corresponding
timesteps in the motion sequence. Based on these key properties, we develop a
versatile set of simple yet effective motion editing methods via manipulating
attention maps, such as motion (de-)emphasizing, in-place motion replacement,
and example-based motion generation. For further verification of the
explainability of the attention mechanism, we additionally explore the
potential of action-counting and grounded motion generation ability via
attention maps. Our experimental results show that our method enjoys good
generation and editing ability with good explainability.
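As a rough illustration of what editing through attention maps could look like, the sketch below builds a frame-by-word cross-attention map and rescales the weight of one word to emphasize or de-emphasize the action it describes. This is a minimal sketch under stated assumptions, not the MotionCLR implementation: the feature shapes, the function names `cross_attention_map` and `reweight_word`, the multiplicative scaling, and the row renormalization are all hypothetical choices made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_map(motion_feats, word_feats):
    # motion_feats: (frames, d) queries; word_feats: (words, d) keys.
    # Returns a (frames, words) map: how strongly each frame attends to each word.
    d = motion_feats.shape[-1]
    scores = motion_feats @ word_feats.T / np.sqrt(d)
    return softmax(scores, axis=-1)

def reweight_word(attn, word_idx, scale):
    # Hypothetical editing rule (an assumption for this sketch):
    # scale one word's column, then renormalize each frame's row to sum to 1.
    edited = attn.copy()
    edited[:, word_idx] *= scale
    return edited / edited.sum(axis=1, keepdims=True)

# Toy data: 8 motion frames, 4 text tokens, feature width 16.
rng = np.random.default_rng(0)
motion = rng.normal(size=(8, 16))
words = rng.normal(size=(4, 16))

attn = cross_attention_map(motion, words)
emphasized = reweight_word(attn, word_idx=2, scale=2.0)    # emphasize word 2
deemphasized = reweight_word(attn, word_idx=2, scale=0.5)  # de-emphasize word 2
```

In this toy setup, a scale above 1 raises every frame's attention to the chosen word and a scale below 1 lowers it, which is the intuition behind motion (de-)emphasizing; the editing rule used in the paper itself may differ.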