FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers

Abstract

Producing expressive facial animations from static images is a challenging task. Prior methods that rely on explicit geometric priors (e.g., facial landmarks or 3DMM) often suffer from artifacts in cross-reenactment and struggle to capture subtle emotions. Furthermore, existing approaches do not support multi-character animation, because driving features from different individuals frequently interfere with one another. To address these challenges, we propose FantasyPortrait, a diffusion-transformer-based framework that generates high-fidelity, emotion-rich animations in both single- and multi-character scenarios. Our method introduces an expression-augmented learning strategy that uses implicit representations to capture identity-agnostic facial dynamics, improving the model's ability to render fine-grained emotions. For multi-character control, we design a masked cross-attention mechanism that ensures independent yet coordinated expression generation and effectively prevents feature interference. To advance research in this area, we propose the Multi-Expr dataset and ExprBench, a dataset and a benchmark designed specifically for training and evaluating multi-character portrait animation. Extensive experiments demonstrate that FantasyPortrait significantly outperforms state-of-the-art methods in both quantitative metrics and qualitative evaluations, and that it excels particularly in challenging cross-reenactment and multi-character contexts. Our project page is https://fantasy-amap.github.io/fantasy-portrait/.
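
The abstract does not say how the masked cross-attention is constructed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that every image token and every driving expression feature carries a character ID, and that a token may attend only to features with the same ID. The function name, the ID convention (-1 for background), and the plain-list data layout are all hypothetical choices made for illustration.

```python
import math


def masked_cross_attention(queries, keys, values, query_owner, key_owner):
    """Single-head cross-attention restricted by character ownership.

    queries:     list of image-token vectors, each of length d
    keys/values: lists of driving-feature vectors (same number of rows)
    query_owner: character ID per query (assumed: -1 means background)
    key_owner:   character ID per key/value row
    """
    d = len(queries[0])
    dv = len(values[0])
    outputs = []
    for q, q_id in zip(queries, query_owner):
        # Keeping only same-ID keys is equivalent to setting every other
        # attention score to -inf before the softmax.
        allowed = [j for j, k_id in enumerate(key_owner) if k_id == q_id]
        if not allowed:
            # No driving feature for this token (e.g. background): emit zeros.
            outputs.append([0.0] * dv)
            continue
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
            for j in allowed
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([
            sum(w * values[j][c] for w, j in zip(weights, allowed))
            for c in range(dv)
        ])
    return outputs
```

Under these assumptions, changing the driving features of one character leaves the output for every other character's tokens untouched, which is the interference-free behavior the abstract describes. How the real model assigns tokens to characters (for example, from face-region masks) is not stated in this text.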
