AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
March 26, 2024
作者: Huawei Wei, Zejun Yang, Zhisheng Wang
cs.AI
Abstract
In this study, we propose AniPortrait, a novel framework for generating
high-quality animation driven by audio and a reference portrait image. Our
methodology is divided into two stages. Initially, we extract 3D intermediate
representations from audio and project them into a sequence of 2D facial
landmarks. Subsequently, we employ a robust diffusion model, coupled with a
motion module, to convert the landmark sequence into photorealistic and
temporally consistent portrait animation. Experimental results demonstrate the
superiority of AniPortrait in terms of facial naturalness, pose diversity, and
visual quality, thereby offering an enhanced perceptual experience. Moreover,
our methodology exhibits considerable potential in terms of flexibility and
controllability, which can be effectively applied in areas such as facial
motion editing or face reenactment. We release code and model weights at
https://github.com/scutzzj/AniPortrait.
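To make the two-stage design described in the abstract concrete, the following is a minimal Python sketch of the data flow: audio is first mapped to a 3D intermediate representation and projected to 2D facial landmarks, which are then rendered into video frames conditioned on the reference portrait. All names here (Audio2LandmarkStage, Landmark2VideoStage, animate_portrait), the 25 fps / 16 kHz framing, and the 68-landmark layout are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of the AniPortrait pipeline described in the abstract.
# Class/function names and shapes are placeholders, not the actual API.
from dataclasses import dataclass
import numpy as np


@dataclass
class Audio2LandmarkStage:
    """Stage 1: audio -> 3D intermediate representation -> 2D facial landmarks."""

    def extract_3d_representation(self, audio: np.ndarray) -> np.ndarray:
        # Placeholder: a real system would encode the audio and regress
        # per-frame 3D facial parameters from it.
        num_frames = len(audio) // 640           # assumes 16 kHz audio at 25 fps
        return np.zeros((num_frames, 68, 3))     # 68 landmarks in 3D per frame

    def project_to_2d(self, landmarks_3d: np.ndarray) -> np.ndarray:
        # Simple orthographic projection onto the image plane (drop depth).
        return landmarks_3d[..., :2]


@dataclass
class Landmark2VideoStage:
    """Stage 2: landmark sequence + reference portrait -> photorealistic frames."""

    def render(self, landmarks_2d: np.ndarray, reference_image: np.ndarray) -> np.ndarray:
        # Placeholder for the diffusion model plus motion module that turns each
        # landmark frame into an image temporally consistent with its neighbours.
        num_frames = landmarks_2d.shape[0]
        h, w, c = reference_image.shape
        return np.zeros((num_frames, h, w, c), dtype=reference_image.dtype)


def animate_portrait(audio: np.ndarray, reference_image: np.ndarray) -> np.ndarray:
    """End-to-end flow: audio + reference portrait image -> animated frames."""
    stage1 = Audio2LandmarkStage()
    stage2 = Landmark2VideoStage()
    landmarks_3d = stage1.extract_3d_representation(audio)
    landmarks_2d = stage1.project_to_2d(landmarks_3d)
    return stage2.render(landmarks_2d, reference_image)
```

In this reading of the abstract, the landmark sequence acts as the interface between the audio side and the image side, which is what makes the framework applicable to facial motion editing and face reenactment: the same second stage can be driven by landmarks obtained from sources other than audio.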