FaceStudio: Put Your Face Everywhere in Seconds
December 5, 2023
Authors: Yuxuan Yan, Chi Zhang, Rui Wang, Pei Cheng, Gang Yu, Bin Fu
cs.AI
Abstract
This study investigates identity-preserving image synthesis, an intriguing
task in image generation that seeks to maintain a subject's identity while
adding a personalized, stylistic touch. Traditional methods, such as Textual
Inversion and DreamBooth, have made strides in custom image creation, but they
come with significant drawbacks. These include the need for extensive resources
and time for fine-tuning, as well as the requirement for multiple reference
images. To overcome these challenges, our research introduces a novel approach
to identity-preserving synthesis, with a particular focus on human images. Our
model leverages a direct feed-forward mechanism, circumventing the need for
intensive fine-tuning, thereby facilitating quick and efficient image
generation. Central to our innovation is a hybrid guidance framework, which
combines stylized images, facial images, and textual prompts to guide the image
generation process. This unique combination enables our model to support a
variety of applications, such as artistic portraits and identity-blended
images. Our experimental results, including both qualitative and quantitative
evaluations, demonstrate the superiority of our method over existing baseline
models and previous works, particularly in its remarkable efficiency and
ability to preserve the subject's identity with high fidelity.