Reflection Removal through Efficient Adaptation of Diffusion Transformers
December 4, 2025
Authors: Daniyar Zakarin, Thiemo Wandel, Anton Obukhov, Dengxin Dai
cs.AI
Abstract
We introduce a diffusion-transformer (DiT) framework for single-image reflection removal that leverages the generalization strengths of foundation diffusion models in the restoration setting. Rather than relying on task-specific architectures, we repurpose a pretrained DiT-based foundation model by conditioning it on reflection-contaminated inputs and guiding it toward clean transmission layers. We systematically analyze existing reflection removal data sources for diversity, scalability, and photorealism. To address the shortage of suitable data, we construct a physically based rendering (PBR) pipeline in Blender, built around the Principled BSDF, to synthesize realistic glass materials and reflection effects. Efficient LoRA-based adaptation of the foundation model, combined with the proposed synthetic data, achieves state-of-the-art performance on in-domain and zero-shot benchmarks. These results demonstrate that pretrained diffusion transformers, when paired with physically grounded data synthesis and efficient adaptation, offer a scalable and high-fidelity solution for reflection removal. Project page: https://hf.co/spaces/huawei-bayerlab/windowseat-reflection-removal-web
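
The abstract names LoRA-based adaptation without giving details, so the following is only a generic sketch of the low-rank update idea, not the authors' implementation. It assumes a single frozen linear layer with weight W and adds a trainable update (alpha / rank) * B A, with B initialized to zero so the adapted layer starts out identical to the pretrained one. The helper names (`matmul`, `make_lora`, `lora_forward`), the layer sizes, and the values of `alpha` and `rank` are all hypothetical and chosen for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def make_lora(d_out, d_in, rank, seed=0):
    """Create LoRA factors: A is small random noise, B starts at zero."""
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    b = [[0.0] * rank for _ in range(d_out)]
    return a, b

def lora_forward(x, w, a, b, alpha, rank):
    """Compute y = W x + (alpha / rank) * B (A x) for a column vector x."""
    base = matmul(w, x)                  # frozen pretrained path
    update = matmul(b, matmul(a, x))     # trainable low-rank path
    scale = alpha / rank
    return [[bv[0] + scale * uv[0]] for bv, uv in zip(base, update)]

# Hypothetical frozen 2x3 weight and a 3x1 input column vector.
W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
x = [[1.0], [2.0], [3.0]]

# At initialization B is zero, so the output equals the frozen layer's output.
A, B = make_lora(d_out=2, d_in=3, rank=1)
y_init = lora_forward(x, W, A, B, alpha=1.0, rank=1)

# Only A and B would be trained: rank * (d_in + d_out) values instead of d_in * d_out.
trainable = 1 * (3 + 2)
```

In this toy case the saving is negligible (5 trainable values versus 6), but for the large square projection matrices typical of transformer blocks, a small rank makes the trainable parameter count a small fraction of the full weight.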