

PersonaLive! Expressive Portrait Image Animation for Live Streaming

December 12, 2025
Authors: Zhiyuan Li, Chi-Man Pun, Chen Fang, Jue Wang, Xiaodong Cun
cs.AI

Abstract

Current diffusion-based portrait animation models focus predominantly on enhancing visual quality and expression realism while overlooking generation latency and real-time performance, which limits their applicability to live-streaming scenarios. We propose PersonaLive, a novel diffusion-based framework for streaming real-time portrait animation, trained with a multi-stage recipe. Specifically, we first adopt hybrid implicit signals, namely implicit facial representations and 3D implicit keypoints, to achieve expressive image-level motion control. We then propose a few-step appearance distillation strategy that eliminates appearance redundancy in the denoising process, greatly improving inference efficiency. Finally, we introduce an autoregressive micro-chunk streaming generation paradigm, equipped with a sliding training strategy and a historical keyframe mechanism, to enable low-latency and stable long-term video generation. Extensive experiments demonstrate that PersonaLive achieves state-of-the-art performance with a 7-22x speedup over prior diffusion-based portrait animation models.
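To make the streaming paradigm concrete, the sketch below shows how an autoregressive micro-chunk generation loop with few-step denoising and a historical keyframe buffer might be structured. This is a minimal illustration of the dataflow only: the module names (`denoiser`, `motion_encoder`, `decoder`), the conditioning interface, and the chunk, step, and buffer sizes are all assumptions for exposition, not PersonaLive's actual implementation.

```python
from collections import deque

import torch

# Hypothetical hyperparameters (not from the paper):
CHUNK_SIZE = 4         # frames generated per micro-chunk
NUM_DENOISE_STEPS = 4  # few-step sampling after appearance distillation
KEYFRAME_BUFFER = 2    # historical keyframes kept for long-term stability


@torch.no_grad()
def stream_animate(denoiser, motion_encoder, decoder, reference_latent, driving_frames):
    """Autoregressively animate a portrait in micro-chunks.

    Each chunk is denoised in a few steps, conditioned on the reference
    appearance, the hybrid implicit motion signals of the driving frames,
    and a small buffer of previously generated keyframe latents.
    """
    keyframes = deque(maxlen=KEYFRAME_BUFFER)  # historical keyframe latents

    for start in range(0, len(driving_frames), CHUNK_SIZE):
        chunk = driving_frames[start:start + CHUNK_SIZE]

        # Hybrid implicit signals: implicit facial representation
        # plus 3D implicit keypoints, extracted per driving frame.
        motion = motion_encoder(chunk)

        # Start each micro-chunk from noise; a handful of denoising
        # steps suffice once appearance redundancy has been distilled away.
        latents = torch.randn(len(chunk), *reference_latent.shape)
        for step in reversed(range(NUM_DENOISE_STEPS)):
            latents = denoiser(
                latents,
                step=step,
                appearance=reference_latent,  # identity/appearance condition
                motion=motion,                # frame-wise motion condition
                history=list(keyframes),      # historical keyframe condition
            )

        # Keep the last latent of the chunk as a keyframe for future chunks.
        keyframes.append(latents[-1])

        # Decode and emit frames immediately for low-latency streaming.
        for frame in decoder(latents):
            yield frame
```

Under this reading, latency is bounded by the cost of one micro-chunk rather than the whole clip, while the keyframe buffer gives later chunks a stable anchor so identity and appearance do not drift over long videos.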