StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation
September 19, 2024
Authors: Zhengguang Zhou, Jing Li, Huaxia Li, Nemo Chen, Xu Tang
cs.AI
Abstract
Tuning-free personalized image generation methods have achieved significant
success in maintaining facial consistency, i.e., identities, even with multiple
characters. However, the lack of holistic consistency in scenes with multiple
characters hampers these methods' ability to create a cohesive narrative. In
this paper, we introduce StoryMaker, a personalization solution that preserves
not only facial consistency but also clothing, hairstyles, and body
consistency, thus facilitating the creation of a story through a series of
images. StoryMaker incorporates conditions based on face identities and cropped
character images, which include clothing, hairstyles, and bodies. Specifically,
we integrate the facial identity information with the cropped character images
using the Positional-aware Perceiver Resampler (PPR) to obtain distinct
character features. To prevent intermingling of multiple characters and the
background, we separately constrain the cross-attention impact regions of
different characters and the background using MSE loss with segmentation masks.
Additionally, we train the generation network conditioned on poses to promote
decoupling from poses. A LoRA is also employed to enhance fidelity and quality.
Experiments underscore the effectiveness of our approach. StoryMaker supports
numerous applications and is compatible with other societal plug-ins. Our
source code and model weights are available at
https://github.com/RedAIGC/StoryMaker.
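
The abstract describes constraining the cross-attention regions of each character and the background with an MSE loss against segmentation masks. It does not give the exact formulation, so the following is only a minimal sketch of that idea, not the authors' implementation. It assumes each region (every character plus the background) has one attention map already normalized to [0, 1] and resized to the mask resolution, and that the per-region errors are simply averaged. The names `region_mse` and `attention_mask_loss` are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import List

Grid = List[List[float]]  # H x W map with values in [0, 1]


def region_mse(attn: Grid, mask: Grid) -> float:
    """Mean squared error between one attention map and its target mask."""
    diffs = [
        (a - m) ** 2
        for attn_row, mask_row in zip(attn, mask)
        for a, m in zip(attn_row, mask_row)
    ]
    return sum(diffs) / len(diffs)


def attention_mask_loss(attn_maps: List[Grid], masks: List[Grid]) -> float:
    """Average the per-region MSE over all characters plus the background.

    Assumption: regions are weighted equally; the paper may weight them
    differently or aggregate over attention layers.
    """
    if len(attn_maps) != len(masks):
        raise ValueError("need one mask per attention map")
    return sum(region_mse(a, m) for a, m in zip(attn_maps, masks)) / len(masks)


# Toy 2x2 layout: character A fills the left column, character B the
# top-right pixel, and the background the bottom-right pixel.
mask_a = [[1.0, 0.0], [1.0, 0.0]]
mask_b = [[0.0, 1.0], [0.0, 0.0]]
mask_bg = [[0.0, 0.0], [0.0, 1.0]]
masks = [mask_a, mask_b, mask_bg]

# Character A's attention leaks into character B's pixel.
leaky_a = [[0.5, 0.5], [1.0, 0.0]]

perfect_loss = attention_mask_loss(masks, masks)
leaky_loss = attention_mask_loss([leaky_a, mask_b, mask_bg], masks)
```

In this toy case the loss is zero when every attention map matches its mask and becomes positive once character A's attention spills into character B's region, which is the intermingling the loss is meant to penalize.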