StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation

September 19, 2024
Authors: Zhengguang Zhou, Jing Li, Huaxia Li, Nemo Chen, Xu Tang
cs.AI

Abstract

Tuning-free personalized image generation methods have achieved significant success in maintaining facial consistency, i.e., identities, even with multiple characters. However, the lack of holistic consistency in scenes with multiple characters hampers these methods' ability to create a cohesive narrative. In this paper, we introduce StoryMaker, a personalization solution that preserves not only facial consistency but also clothing, hairstyle, and body consistency, thus facilitating the creation of a story through a series of images. StoryMaker incorporates conditions based on face identities and cropped character images, which include clothing, hairstyles, and bodies. Specifically, we integrate the facial identity information with the cropped character images using the Positional-aware Perceiver Resampler (PPR) to obtain distinct character features. To prevent intermingling of multiple characters and the background, we separately constrain the cross-attention impact regions of different characters and the background using an MSE loss with segmentation masks. Additionally, we train the generation network conditioned on poses to promote decoupling from poses. A LoRA is also employed to enhance fidelity and quality. Experiments underscore the effectiveness of our approach. StoryMaker supports numerous applications and is compatible with other societal plug-ins. Our source code and model weights are available at https://github.com/RedAIGC/StoryMaker.
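The abstract states that cross-attention regions of each character and the background are constrained by an MSE loss against segmentation masks, but it does not give the exact formulation. The following is a minimal NumPy sketch of one plausible reading, not the authors' implementation: the function name `attention_region_loss`, the array layout `(num_entities, H, W)`, and the toy masks are all assumptions made for illustration.

```python
import numpy as np


def attention_region_loss(attn_maps, masks):
    """Mean squared error between per-entity attention maps and masks.

    Hypothetical helper (not from the StoryMaker code base).
    attn_maps: array of shape (num_entities, H, W), values in [0, 1].
    masks:     binary array of the same shape; 1 marks where an entity
               (a character or the background) is allowed to attend.
    """
    attn_maps = np.asarray(attn_maps, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)
    if attn_maps.shape != masks.shape:
        raise ValueError("attn_maps and masks must have the same shape")
    return float(np.mean((attn_maps - masks) ** 2))


# Toy layout on a 2x4 grid: character A on the left column,
# character B on the right column, background in the middle.
masks = np.array(
    [
        [[1, 0, 0, 0], [1, 0, 0, 0]],  # character A
        [[0, 0, 0, 1], [0, 0, 0, 1]],  # character B
        [[0, 1, 1, 0], [0, 1, 1, 0]],  # background
    ],
    dtype=float,
)

# Attention that exactly matches the masks incurs no penalty.
loss_perfect = attention_region_loss(masks.copy(), masks)

# Uniform attention (every entity attends everywhere equally) mixes
# characters and background, so it is penalized.
leaky = np.full_like(masks, 1.0 / 3.0)
loss_leaky = attention_region_loss(leaky, masks)

print(loss_perfect, loss_leaky)
```

In this sketch the loss is zero only when each entity's attention stays inside its own mask, which is the behavior the abstract describes as preventing characters and background from intermingling. In an actual training setup the maps would come from the diffusion model's cross-attention layers and the loss would be added to the denoising objective; those details are not specified in the abstract.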
