CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion
January 25, 2024
Authors: Nisha Huang, Weiming Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu
cs.AI
Abstract
Large-scale text-to-image generative models have made impressive strides,
showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two
significant challenges. Firstly, users struggle to craft textual prompts that
meticulously detail visual elements of the input image. Secondly, prevalent
models, when effecting modifications in specific zones, frequently disrupt the
overall artistic style, complicating the attainment of cohesive and
aesthetically unified artworks. To surmount these obstacles, we build
CreativeSynth, an innovative unified framework based on a diffusion model that
can coordinate multimodal inputs and handle multiple tasks in artistic image
generation. By integrating multimodal features with customized
attention mechanisms, CreativeSynth facilitates the importation of real-world
semantic content into the domain of art through inversion and real-time style
transfer. This allows for the precise manipulation of image style and content
while maintaining the integrity of the original model parameters. Rigorous
qualitative and quantitative evaluations underscore that CreativeSynth improves
the fidelity of artistic images while preserving their innate aesthetic
essence. By bridging the gap between generative models and artistic finesse,
CreativeSynth serves as a custom digital palette.