FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers
July 17, 2025
Authors: Qiang Wang, Mengchao Wang, Fan Jiang, Yaqi Fan, Yonggang Qi, Mu Xu
cs.AI
Abstract
Producing expressive facial animations from static images is a challenging
task. Prior methods relying on explicit geometric priors (e.g., facial
landmarks or 3DMM) often suffer from artifacts in cross-reenactment and
struggle to capture subtle emotions. Furthermore, existing approaches lack
support for multi-character animation, as driving features from different
individuals frequently interfere with one another, complicating the task. To
address these challenges, we propose FantasyPortrait, a framework based on
diffusion transformers that generates high-fidelity, emotion-rich animations
for both single- and multi-character scenarios. Our method introduces an
expression-augmented learning strategy that utilizes implicit representations
to capture identity-agnostic facial dynamics, enhancing the model's ability to
render fine-grained emotions. For multi-character control, we design a masked
cross-attention mechanism that ensures independent yet coordinated expression
generation, effectively preventing feature interference. To advance research in
this area, we introduce the Multi-Expr dataset and the ExprBench benchmark,
designed specifically for training and evaluating
multi-character portrait animation. Extensive experiments demonstrate that
FantasyPortrait significantly outperforms state-of-the-art methods in both
quantitative metrics and qualitative evaluations, excelling particularly in
challenging cross-reenactment and multi-character settings. Our project page is
https://fantasy-amap.github.io/fantasy-portrait/.
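
The masked cross-attention idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: the function name `masked_cross_attention`, the per-token labels `query_region` and `key_owner`, and the choice to return a zero vector for tokens outside every face region are all hypothetical. The sketch only shows the general technique: each image token is allowed to attend to the expression tokens of the character whose face region it belongs to, and to no others, so driving features from different characters cannot mix.

```python
import math


def masked_cross_attention(queries, keys, values, query_region, key_owner):
    """Cross-attention where each query only sees keys of its own character.

    queries      -- list of d-dim vectors (e.g. image/video latent tokens)
    keys, values -- lists of vectors (e.g. per-character expression tokens)
    query_region -- character id per query, or None for background tokens
    key_owner    -- character id per key/value pair
    All names and the labelling scheme are illustrative assumptions.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    value_dim = len(values[0])
    outputs = []
    for query, region in zip(queries, query_region):
        # The mask: keep only keys owned by the character of this region.
        allowed = [j for j, owner in enumerate(key_owner) if owner == region]
        if region is None or not allowed:
            # Background token: no expression signal is injected.
            outputs.append([0.0] * value_dim)
            continue
        scores = [
            scale * sum(a * b for a, b in zip(query, keys[j])) for j in allowed
        ]
        # Numerically stable softmax over the allowed keys only.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * values[j][k] for w, j in zip(weights, allowed))
            for k in range(value_dim)
        ])
    return outputs


# Two characters (ids 0 and 1) plus one background token.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 20.0]]
out = masked_cross_attention(
    queries, keys, values, query_region=[0, 1, None], key_owner=[0, 1]
)
print(out)
```

In this toy setup each face token receives only the value vector of its own character, and the background token receives nothing. A practical implementation would more likely express the same mask as a boolean or additive attention mask passed to a batched attention kernel rather than looping over tokens.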