ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning

April 23, 2024
Authors: Weifeng Chen, Jiacheng Zhang, Jie Wu, Hefeng Wu, Xuefeng Xiao, Liang Lin
cs.AI

Abstract

The rapid development of diffusion models has enabled a wide range of applications. Identity-preserving text-to-image generation (ID-T2I) in particular has received significant attention because of its many application scenarios, such as AI portraits and advertising. While existing ID-T2I methods have demonstrated impressive results, several key challenges remain: (1) it is hard to accurately maintain the identity characteristics of reference portraits, (2) the generated images lack aesthetic appeal, especially when identity retention is enforced, and (3) existing approaches are not compatible with both LoRA-based and Adapter-based methods. To address these issues, we present ID-Aligner, a general feedback learning framework that enhances ID-T2I performance. To address the loss of identity features, we introduce identity consistency reward fine-tuning, which uses feedback from face detection and recognition models to improve identity preservation in generated images. Furthermore, we propose identity aesthetic reward fine-tuning, which uses rewards from human-annotated preference data and automatically constructed feedback on character structure generation to provide aesthetic tuning signals. Because the feedback fine-tuning framework is universal, our method can be readily applied to both LoRA and Adapter models, achieving consistent performance gains. Extensive experiments on SD1.5 and SDXL diffusion models validate the effectiveness of our approach. Project page: https://idaligner.github.io/
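
The abstract does not give the exact form of the identity consistency reward, so the following is only a minimal sketch of one plausible reading: a face recognition model yields an embedding for the reference portrait and for the generated image, and the reward is their cosine similarity. The function names, the zero reward when no face is detected, and the `1 - reward` loss are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # degenerate embedding: treat as no similarity
    return dot / (norm_a * norm_b)


def identity_consistency_reward(
    ref_embedding: Sequence[float],
    gen_embedding: Optional[Sequence[float]],
) -> float:
    """Hypothetical reward: similarity of reference and generated face embeddings.

    gen_embedding is None when the face detector finds no face in the
    generated image; returning 0.0 in that case is an assumption.
    """
    if gen_embedding is None:
        return 0.0
    return cosine_similarity(ref_embedding, gen_embedding)


def identity_loss(reward: float) -> float:
    """Assumed fine-tuning objective: a higher reward gives a lower loss."""
    return 1.0 - reward
```

In an actual feedback-learning setup the embeddings would come from a differentiable face recognition network so that the loss could be backpropagated into the LoRA or Adapter weights; the plain-Python version above only illustrates the scoring logic.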
