OmniPSD: Layered PSD Generation with Diffusion Transformer
December 10, 2025
Authors: Cheng Liu, Yiren Song, Haofan Wang, Mike Zheng Shou
cs.AI
Abstract
Recent advances in diffusion models have greatly improved image generation and editing, yet generating or reconstructing layered PSD files with transparent alpha channels remains highly challenging. We propose OmniPSD, a unified diffusion framework built upon the Flux ecosystem that enables both text-to-PSD generation and image-to-PSD decomposition through in-context learning. For text-to-PSD generation, OmniPSD arranges multiple target layers spatially into a single canvas and learns their compositional relationships through spatial attention, producing semantically coherent and hierarchically structured layers. For image-to-PSD decomposition, it performs iterative in-context editing, progressively extracting and erasing textual and foreground components to reconstruct editable PSD layers from a single flattened image. An RGBA-VAE is employed as an auxiliary representation module to preserve transparency without affecting structure learning. Extensive experiments on our new RGBA-layered dataset demonstrate that OmniPSD achieves high-fidelity generation, structural consistency, and transparency awareness, offering a new paradigm for layered design generation and decomposition with diffusion transformers.
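The iterative decomposition procedure is concrete enough to sketch. Below is a minimal Python illustration of the peel-and-erase loop the abstract describes, not OmniPSD's actual implementation: extract_topmost_layer is a hypothetical stand-in for the in-context editing model (its real API is not described here), stubbed so the control flow runs end to end.

    # Sketch of iterative image-to-PSD decomposition: repeatedly extract the
    # topmost text/foreground element as an RGBA layer and erase it from the
    # flattened image, until only the background remains.
    from PIL import Image

    def extract_topmost_layer(flat: Image.Image) -> tuple[Image.Image, Image.Image]:
        """Hypothetical model call returning (layer_rgba, erased_image).

        A real system would invoke the diffusion transformer here: once to
        extract the element together with its alpha channel, and once to
        inpaint (erase) it from the flattened image. Stubbed here with a
        fully transparent layer and the unchanged input.
        """
        layer = Image.new("RGBA", flat.size, (0, 0, 0, 0))
        return layer, flat.copy()

    def decompose(flat: Image.Image, max_layers: int = 8) -> list[Image.Image]:
        layers = []
        current = flat.convert("RGBA")
        for _ in range(max_layers):
            layer, current = extract_topmost_layer(current)
            if layer.getextrema()[3][1] == 0:  # max alpha is 0: nothing extracted
                break
            layers.append(layer)
        layers.append(current)   # residual background becomes the bottom layer
        return layers[::-1]      # bottom-to-top, matching PSD stacking order

The bottom-to-top ordering matches how layers are stacked in a PSD file; a writer library such as pytoshop could then serialize the resulting list into an actual .psd.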