

Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

May 21, 2024
作者: Yue Han, Junwei Zhu, Keke He, Xu Chen, Yanhao Ge, Wei Li, Xiangtai Li, Jiangning Zhang, Chengjie Wang, Yong Liu
cs.AI

Abstract

Current face reenactment and swapping methods mainly rely on GAN frameworks, but recent focus has shifted to pre-trained diffusion models for their superior generation capabilities. However, training these models is resource-intensive, and the results have not yet reached satisfactory performance levels. To address this, we introduce Face-Adapter, an efficient and effective adapter designed for high-precision, high-fidelity face editing with pre-trained diffusion models. We observe that both the face reenactment and swapping tasks essentially involve combinations of target structure, identity, and attributes. We aim to sufficiently decouple control of these factors so that both tasks can be achieved in one model. Specifically, our method comprises: 1) a Spatial Condition Generator that provides precise landmarks and background; 2) a Plug-and-play Identity Encoder that transfers face embeddings to the text space via a transformer decoder; and 3) an Attribute Controller that integrates spatial conditions and detailed attributes. Face-Adapter achieves comparable or even superior performance in motion control precision, ID retention, and generation quality relative to fully fine-tuned face reenactment/swapping models. Additionally, Face-Adapter integrates seamlessly with various StableDiffusion models.
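The Plug-and-play Identity Encoder described above maps a face-recognition embedding into the text (cross-attention) space of the diffusion model via a transformer decoder. A minimal sketch of that idea follows; the class name, dimensions (512-d face embedding, 768-d text space), token count, and layer count are all assumptions for illustration, not the paper's actual configuration.

```python
# Hypothetical sketch: learnable query tokens cross-attend to a projected
# face embedding through a transformer decoder, yielding text-space tokens
# that a pre-trained diffusion model's cross-attention can consume.
import torch
import torch.nn as nn

class IdentityEncoder(nn.Module):
    def __init__(self, face_dim=512, text_dim=768, num_tokens=4, num_layers=2):
        super().__init__()
        # Project the frozen face-recognition embedding to the text-space width.
        self.proj = nn.Linear(face_dim, text_dim)
        # Learnable queries that will become the identity tokens.
        self.queries = nn.Parameter(torch.randn(num_tokens, text_dim) * 0.02)
        layer = nn.TransformerDecoderLayer(d_model=text_dim, nhead=8,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)

    def forward(self, face_emb):                    # face_emb: (B, face_dim)
        memory = self.proj(face_emb).unsqueeze(1)   # (B, 1, text_dim)
        tgt = self.queries.unsqueeze(0).expand(face_emb.size(0), -1, -1)
        return self.decoder(tgt, memory)            # (B, num_tokens, text_dim)

enc = IdentityEncoder()
tokens = enc(torch.randn(2, 512))
print(tokens.shape)  # torch.Size([2, 4, 768])
```

The resulting tokens can be concatenated with (or substituted for) text-encoder outputs, which is what makes such an encoder "plug-and-play" across different StableDiffusion checkpoints sharing the same cross-attention width.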