Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation

Abstract

Close-proximity human-human interactive poses convey rich contextual information about interaction dynamics. Given such poses, humans can intuitively infer the context and anticipate possible past and future dynamics, drawing on strong priors of human behavior. Inspired by this observation, we propose Ponimator, a simple framework anchored on proximal interactive poses for versatile interaction animation. Our training data consists of close-contact two-person poses and their surrounding temporal context, drawn from motion-capture interaction datasets. Leveraging interactive pose priors, Ponimator employs two conditional diffusion models: (1) a pose animator that uses the temporal prior to generate dynamic motion sequences from interactive poses, and (2) a pose generator that applies the spatial prior to synthesize interactive poses from a single pose, text, or both when interactive poses are unavailable. Together, these models let Ponimator support diverse tasks, including image-based interaction animation, reaction animation, and text-to-interaction synthesis, and facilitate the transfer of interaction knowledge from high-quality mocap data to open-world scenarios. Experiments across diverse datasets and applications demonstrate the universality of the pose prior and the effectiveness and robustness of our framework.
