TEDi: Temporally-Entangled Diffusion for Long-Term Motion Synthesis

Abstract

The gradual nature of a diffusion process, which synthesizes samples in small increments, is a key ingredient of Denoising Diffusion Probabilistic Models (DDPM). These models have achieved unprecedented quality in image synthesis and have recently been explored in the motion domain. In this work, we propose to adapt the gradual diffusion concept, which operates along a diffusion time axis, to the temporal axis of the motion sequence. Our key idea is to extend the DDPM framework to support temporally varying denoising, thereby entangling the two axes. Using our special formulation, we iteratively denoise a motion buffer that contains a set of increasingly noised poses, which auto-regressively produces an arbitrarily long stream of frames. With a stationary diffusion time axis, each diffusion step increments only the temporal axis of the motion: the framework produces a new, clean frame, which is removed from the beginning of the buffer, and a newly drawn noise vector is then appended to the end of the buffer. This mechanism paves the way towards a new framework for long-term motion synthesis, with applications to character animation and other domains.
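The buffer mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `stream_frames` and `toy_denoise_step`, the linear noise levels, and the noise-only initialization of the buffer are all assumptions made for illustration, and the toy denoiser stands in for a trained network by shrinking each pose toward an all-zero "clean" pose.

```python
import random
from collections import deque
from typing import Callable, Deque, Iterator, List

Pose = List[float]


def toy_denoise_step(pose: Pose, level: int) -> Pose:
    """Hypothetical stand-in for a trained denoiser (assumption).

    Moves a pose from noise level `level` to `level - 1` by shrinking it
    toward the all-zero pose, which plays the role of the clean signal here.
    """
    scale = (level - 1) / level
    return [x * scale for x in pose]


def stream_frames(
    buffer_len: int,
    pose_dim: int,
    num_frames: int,
    denoise_step: Callable[[Pose, int], Pose] = toy_denoise_step,
    seed: int = 0,
) -> Iterator[Pose]:
    """Yield `num_frames` clean frames from a buffer of increasingly noised poses."""
    rng = random.Random(seed)

    def fresh_noise() -> Pose:
        return [rng.gauss(0.0, 1.0) for _ in range(pose_dim)]

    # Slot i holds a pose at noise level i + 1, so the front is almost clean
    # and the back is pure noise. Initializing from scaled noise is an
    # assumption of this sketch.
    buffer: Deque[Pose] = deque(
        [x * (i + 1) / buffer_len for x in fresh_noise()]
        for i in range(buffer_len)
    )

    for _ in range(num_frames):
        # One diffusion step over the whole buffer: slot i drops from
        # noise level i + 1 to noise level i.
        buffer = deque(denoise_step(p, i + 1) for i, p in enumerate(buffer))
        # The front slot has reached level 0 (clean): emit and remove it.
        yield buffer.popleft()
        # Append freshly drawn noise at the highest level to keep the length fixed.
        buffer.append(fresh_noise())


if __name__ == "__main__":
    for frame in stream_frames(buffer_len=4, pose_dim=3, num_frames=5):
        print(frame)
```

Each loop iteration performs a single denoising pass and emits exactly one frame, so the stream can run for any number of frames while the buffer length stays constant. With the toy denoiser every emitted frame is the all-zero pose; a learned denoiser would produce actual motion instead.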
