HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models
July 13, 2023
Authors: Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Wei Wei, Tingbo Hou, Yael Pritch, Neal Wadhwa, Michael Rubinstein, Kfir Aberman
cs.AI
Abstract
Personalization has emerged as a prominent aspect within the field of
generative AI, enabling the synthesis of individuals in diverse contexts and
styles, while retaining high fidelity to their identities. However, the process
of personalization presents inherent challenges in terms of time and memory
requirements. Fine-tuning each personalized model needs considerable GPU time
investment, and storing a personalized model per subject can be demanding in
terms of storage capacity. To overcome these challenges, we propose
HyperDreamBooth, a hypernetwork capable of efficiently generating a small set of
personalized weights from a single image of a person. By composing these
weights into the diffusion model, coupled with fast fine-tuning, HyperDreamBooth
can generate a person's face in various contexts and styles, with high subject
detail, while also preserving the model's crucial knowledge of diverse styles
and semantic modifications. Our method achieves personalization on faces in
roughly 20 seconds, 25x faster than DreamBooth and 125x faster than Textual
Inversion, using as few as one reference image, with the same quality and style
diversity as DreamBooth. Our method also yields a model that is 10000x smaller
than a normal DreamBooth model. Project page: https://hyperdreambooth.github.io
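
As a quick sanity check on the speed claims, the stated speedups imply approximate personalization times for the two baselines. The sketch below assumes that the 25x and 125x factors are both measured against the roughly 20 second figure, and that "Nx faster" means the baseline takes N times as long. The variable and function names are illustrative, and the derived times are back-of-the-envelope estimates rather than numbers reported in the abstract.

```python
# Figures quoted in the abstract.
hyperdreambooth_seconds = 20          # "roughly 20 seconds"
speedup_vs_dreambooth = 25            # "25x faster than DreamBooth"
speedup_vs_textual_inversion = 125    # "125x faster than Textual Inversion"


def implied_baseline_seconds(fast_seconds, speedup):
    """Assumption: speedup = baseline time / fast time."""
    return fast_seconds * speedup


dreambooth_seconds = implied_baseline_seconds(
    hyperdreambooth_seconds, speedup_vs_dreambooth
)
textual_inversion_seconds = implied_baseline_seconds(
    hyperdreambooth_seconds, speedup_vs_textual_inversion
)

# Derived estimates only; the abstract does not state these times directly.
print(f"DreamBooth: ~{dreambooth_seconds / 60:.1f} min")
print(f"Textual Inversion: ~{textual_inversion_seconds / 60:.1f} min")
```

Under these assumptions the implied baselines are on the order of 8 minutes for DreamBooth and 42 minutes for Textual Inversion per subject.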