Controllable Layer Decomposition for Reversible Multi-Layer Image Generation

November 20, 2025
Authors: Zihao Liu, Zunnan Xu, Shi Shu, Jun Zhou, Ruicheng Zhang, Zhenchao Tang, Xiu Li
cs.AI

Abstract

This work presents Controllable Layer Decomposition (CLD), a method for achieving fine-grained and controllable multi-layer separation of raster images. In practical workflows, designers typically generate and edit each RGBA layer independently before compositing them into a final raster image. However, this process is irreversible: once composited, layer-level editing is no longer possible. Existing methods commonly rely on image matting and inpainting, but remain limited in controllability and segmentation precision. To address these challenges, we propose two key modules: LayerDecompose-DiT (LD-DiT), which decouples image elements into distinct layers and enables fine-grained control; and Multi-Layer Conditional Adapter (MLCA), which injects target image information into multi-layer tokens to achieve precise conditional generation. To enable a comprehensive evaluation, we build a new benchmark and introduce tailored evaluation metrics. Experimental results show that CLD consistently outperforms existing methods in both decomposition quality and controllability. Furthermore, the separated layers produced by CLD can be directly manipulated in commonly used design tools such as PowerPoint, highlighting its practical value and applicability in real-world creative workflows.
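
The irreversibility of compositing mentioned in the abstract can be illustrated with a small sketch. It is not part of CLD: it only applies the standard Porter-Duff "over" operator to single straight-alpha RGBA pixels, and the function names and the two layer stacks are hypothetical. Two different stacks flatten to the same pixel, so the flattened result alone does not determine the layers that produced it.

```python
from functools import reduce


def over(top, bottom):
    """Composite one straight-alpha RGBA pixel over another (Porter-Duff 'over')."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels are fully transparent; colour is undefined, so return zeros.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        # Alpha-weighted mix of one colour channel, normalised by the output alpha.
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def composite(layers):
    """Flatten a bottom-to-top list of RGBA pixels into a single pixel."""
    return reduce(lambda acc, layer: over(layer, acc), layers)


# Two hypothetical layer stacks (bottom layer first), channel values in [0, 1].
stack_a = [(1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 0.0)]  # red background, fully transparent green layer
stack_b = [(0.0, 0.0, 1.0, 1.0), (1.0, 0.0, 0.0, 1.0)]  # blue background hidden by an opaque red layer

flat_a = composite(stack_a)
flat_b = composite(stack_b)
print(flat_a == flat_b)  # both stacks flatten to the same opaque red pixel
```

Because the mapping from layers to the flattened pixel is many-to-one, recovering layers from a raster image needs additional priors or user control, which is the setting the abstract describes.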