FaceStudio: Put Your Face Everywhere in Seconds
December 5, 2023
Authors: Yuxuan Yan, Chi Zhang, Rui Wang, Pei Cheng, Gang Yu, Bin Fu
cs.AI
Abstract
This study investigates identity-preserving image synthesis, an intriguing
task in image generation that seeks to maintain a subject's identity while
adding a personalized, stylistic touch. Traditional methods, such as Textual
Inversion and DreamBooth, have made strides in custom image creation, but they
come with significant drawbacks. These include the need for extensive resources
and time for fine-tuning, as well as the requirement for multiple reference
images. To overcome these challenges, our research introduces a novel approach
to identity-preserving synthesis, with a particular focus on human images. Our
model leverages a direct feed-forward mechanism, circumventing the need for
intensive fine-tuning, thereby facilitating quick and efficient image
generation. Central to our innovation is a hybrid guidance framework, which
combines stylized images, facial images, and textual prompts to guide the image
generation process. This unique combination enables our model to support a
variety of applications, such as artistic portraits and identity-blended
images. Our experimental results, including both qualitative and quantitative
evaluations, demonstrate the superiority of our method over existing baseline
models and previous works, particularly in its remarkable efficiency and
ability to preserve the subject's identity with high fidelity.