OmniPSD: Layered PSD Generation with Diffusion Transformer

December 10, 2025
Authors: Cheng Liu, Yiren Song, Haofan Wang, Mike Zheng Shou
cs.AI

Abstract

Recent advances in diffusion models have greatly improved image generation and editing, yet generating or reconstructing layered PSD files with transparent alpha channels remains highly challenging. We propose OmniPSD, a unified diffusion framework built upon the Flux ecosystem that enables both text-to-PSD generation and image-to-PSD decomposition through in-context learning. For text-to-PSD generation, OmniPSD arranges multiple target layers spatially into a single canvas and learns their compositional relationships through spatial attention, producing semantically coherent and hierarchically structured layers. For image-to-PSD decomposition, it performs iterative in-context editing, progressively extracting and erasing textual and foreground components to reconstruct editable PSD layers from a single flattened image. An RGBA-VAE is employed as an auxiliary representation module to preserve transparency without affecting structure learning. Extensive experiments on our new RGBA-layered dataset demonstrate that OmniPSD achieves high-fidelity generation, structural consistency, and transparency awareness, offering a new paradigm for layered design generation and decomposition with diffusion transformers.
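To make the image-to-PSD direction concrete, the sketch below illustrates the peel-and-erase iteration the abstract describes: repeatedly extract the topmost text or foreground component as an RGBA layer, erase it from the canvas, and keep the remaining background as the bottom layer. This is a minimal sketch, not the authors' code: `extract_topmost_layer` and `erase_layer` are hypothetical stand-ins for OmniPSD's in-context editing passes, and only the alpha-compositing consistency check uses real Pillow operations.

```python
# Minimal sketch (not the authors' implementation) of iterative image-to-PSD
# decomposition: peel off the topmost layer, erase it, repeat, keep the background.
from __future__ import annotations

from PIL import Image


def extract_topmost_layer(canvas: Image.Image) -> Image.Image | None:
    """Hypothetical model call: return the topmost text/foreground component as an
    RGBA layer with a transparent background, or None when nothing remains."""
    raise NotImplementedError


def erase_layer(canvas: Image.Image, layer: Image.Image) -> Image.Image:
    """Hypothetical model call: inpaint the canvas with `layer` removed,
    exposing the content underneath."""
    raise NotImplementedError


def decompose_to_layers(flat_image: Image.Image, max_layers: int = 8) -> list[Image.Image]:
    """Recover an ordered stack of RGBA layers from a single flattened image."""
    layers: list[Image.Image] = []
    canvas = flat_image.convert("RGB")
    for _ in range(max_layers):
        layer = extract_topmost_layer(canvas)
        if layer is None:                   # nothing left but the background
            break
        layers.append(layer)
        canvas = erase_layer(canvas, layer)
    layers.append(canvas.convert("RGBA"))   # remaining background becomes the bottom layer
    return list(reversed(layers))           # bottom-to-top, the order layers sit in a PSD


def flatten(layers: list[Image.Image]) -> Image.Image:
    """Alpha-composite the recovered stack bottom-to-top; comparing the result with
    the original flattened input is a natural consistency check for the decomposition."""
    out = Image.new("RGBA", layers[0].size, (0, 0, 0, 0))
    for layer in layers:
        out = Image.alpha_composite(out, layer)
    return out
```

In the text-to-PSD direction, the same compositing relation runs in reverse: the layers generated side by side on a single canvas are meant to flatten back into one coherent design under standard alpha blending.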