SCAIL-2: Unifying Controllable Character Animation with End-to-End In-Context Conditioning

Abstract

Controlled character animation requires transferring motion from a driving sequence to a reference character. Prior works rely heavily on intermediate representations, such as pose skeletons to represent motion or masked backgrounds to represent the environment, which inevitably leads to information loss. To address this, we present SCAIL-2, a framework that bypasses these intermediates and achieves end-to-end character animation. By directly concatenating driving videos to the sequence, the model can obtain all the required visual information from the input video. To address the lack of end-to-end data, we unify the sub-tasks of character animation with decoupled conditions and build a pipeline to synthesize MotionPair-60K, an end-to-end motion transfer dataset covering heterogeneous character animation tasks. To achieve this unification, we use in-context mask conditioning and mode-specific RoPE as soft guidance beyond textual instructions and raw visual information. To address synthetic discrepancies in detailed regions, we propose Bias-Aware DPO, which constructs preference items to mitigate the errors. Extensive experiments demonstrate that our method substantially outperforms existing state-of-the-art approaches across various character animation tasks. A large subset of the synthetic data, as well as the model weights, will be released on our project page: https://teal024.github.io/SCAIL-2/.