FlexPainter: Flexible and Multi-View Consistent Texture Generation
June 3, 2025
Authors: Dongyu Yan, Leyi Wu, Jiantao Lin, Luozhou Wang, Tianshuo Xu, Zhifei Chen, Zhen Yang, Lie Xu, Shunsi Zhang, Yingcong Chen
cs.AI
Abstract
Texture map production is an important part of 3D modeling and directly determines rendering quality. Recently, diffusion-based methods have opened new avenues for texture generation. However, restricted control flexibility and limited prompt modalities may prevent creators from producing the desired results. Furthermore, inconsistencies between generated multi-view images often lead to poor texture generation quality. To address these issues, we introduce FlexPainter, a novel texture generation pipeline that enables flexible multi-modal conditional guidance and achieves highly consistent texture generation. A shared conditional embedding space is constructed to perform flexible aggregation between different input modalities. Utilizing this embedding space, we present an image-based classifier-free guidance (CFG) method that decomposes structural and style information, achieving reference-image-based stylization. Leveraging the 3D knowledge within the image diffusion prior, we first generate multi-view images simultaneously using a grid representation to enhance global understanding. Meanwhile, we propose a view synchronization and adaptive weighting module applied during diffusion sampling to further ensure local consistency. Finally, a 3D-aware texture completion model, combined with a texture enhancement model, is used to generate seamless, high-resolution texture maps. Comprehensive experiments demonstrate that our framework significantly outperforms state-of-the-art methods in both flexibility and generation quality.
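
To make the flexible multi-modal aggregation concrete, here is a minimal sketch of how prompts from different modalities could be blended once they live in one shared embedding space. The encoder names (`encode_text`, `encode_image`) and the weighted-sum aggregation are illustrative assumptions, not the paper's actual encoders or aggregation scheme.

```python
import numpy as np

def aggregate_conditions(embeddings: list[np.ndarray],
                         weights: list[float]) -> np.ndarray:
    """Blend per-modality embeddings with user-chosen weights, then
    re-normalize so the result stays on the unit hypersphere."""
    assert len(embeddings) == len(weights)
    blended = sum(w * e for w, e in zip(weights, embeddings))
    return blended / np.linalg.norm(blended)

# Usage (hypothetical encoders mapping each modality into the shared space):
# text_emb = encode_text("a weathered bronze statue")
# image_emb = encode_image(reference_image)
# cond = aggregate_conditions([text_emb, image_emb], [0.7, 0.3])
```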
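The image-based CFG idea can be pictured as a two-term extension of standard classifier-free guidance, with one guidance direction for the structural condition and another for the style reference. The two-term linear composition and the default weights below follow the common multi-condition CFG form and are assumptions; the paper's exact decomposition may differ.

```python
import numpy as np

def image_based_cfg(eps_uncond: np.ndarray,
                    eps_struct: np.ndarray,
                    eps_style: np.ndarray,
                    w_struct: float = 5.0,
                    w_style: float = 3.0) -> np.ndarray:
    """Compose the final noise prediction from two independent guidance
    directions, each measured against the unconditional prediction:
    one toward the structural condition, one toward the style reference."""
    return (eps_uncond
            + w_struct * (eps_struct - eps_uncond)
            + w_style * (eps_style - eps_uncond))
```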
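Generating all views simultaneously "using a grid representation" can be sketched as tiling the views into a single image so one denoising pass shares global context across them. The 2x2 layout and four-view count are assumptions for illustration.

```python
import numpy as np

def views_to_grid(views: list[np.ndarray]) -> np.ndarray:
    """Tile four (H, W, C) views into one 2x2 grid image so the diffusion
    model denoises them jointly, letting attention span all views."""
    top = np.concatenate(views[:2], axis=1)
    bottom = np.concatenate(views[2:], axis=1)
    return np.concatenate([top, bottom], axis=0)

def grid_to_views(grid: np.ndarray) -> list[np.ndarray]:
    """Invert views_to_grid: split the grid back into four views."""
    h, w = grid.shape[0] // 2, grid.shape[1] // 2
    return [grid[:h, :w], grid[:h, w:], grid[h:, :w], grid[h:, w:]]
```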
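Finally, the view synchronization and adaptive weighting step can be sketched as fusing each view's current prediction into a shared UV texture between denoising steps, weighting texels by how frontally each view observes the surface. The cosine-of-viewing-angle weights, the nearest-texel splatting, and all signatures here are assumptions, not the paper's implementation.

```python
import numpy as np

def synchronize_views(view_preds, uv_maps, cos_angles, tex_res=512):
    """Fuse per-view predictions into one texture with adaptive weights.

    view_preds: list of (H, W, 3) per-view predictions at the current step.
    uv_maps:    list of (H, W, 2) texel coordinates in [0, 1] per view pixel.
    cos_angles: list of (H, W) cosines between view direction and surface
                normal; frontal views get more weight (assumed weighting).
    Returns a (tex_res, tex_res, 3) fused texture; the caller would then
    re-render each view from it before the next denoising step.
    """
    tex = np.zeros((tex_res, tex_res, 3))
    wsum = np.zeros((tex_res, tex_res, 1))
    for pred, uv, cos_a in zip(view_preds, uv_maps, cos_angles):
        u = (uv[..., 0] * (tex_res - 1)).astype(int)
        v = (uv[..., 1] * (tex_res - 1)).astype(int)
        w = np.clip(cos_a, 0.0, None)[..., None]  # adaptive per-pixel weight
        np.add.at(tex, (v, u), w * pred)          # weighted splat into UV space
        np.add.at(wsum, (v, u), w)
    return tex / np.maximum(wsum, 1e-8)
```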