Reflection Removal through Efficient Adaptation of Diffusion Transformers
December 4, 2025
Authors: Daniyar Zakarin, Thiemo Wandel, Anton Obukhov, Dengxin Dai
cs.AI
Abstract
We introduce a diffusion-transformer (DiT) framework for single-image reflection removal that leverages the generalization strengths of foundation diffusion models in the restoration setting. Rather than relying on task-specific architectures, we repurpose a pre-trained DiT-based foundation model by conditioning it on reflection-contaminated inputs and guiding it toward clean transmission layers. We systematically analyze existing reflection removal data sources for diversity, scalability, and photorealism. To address the shortage of suitable data, we construct a physically based rendering (PBR) pipeline in Blender, built around the Principled BSDF, to synthesize realistic glass materials and reflection effects. Efficient LoRA-based adaptation of the foundation model, combined with the proposed synthetic data, achieves state-of-the-art performance on in-domain and zero-shot benchmarks. These results demonstrate that pretrained diffusion transformers, when paired with physically grounded data synthesis and efficient adaptation, offer a scalable and high-fidelity solution for reflection removal. Project page: https://hf.co/spaces/huawei-bayerlab/windowseat-reflection-removal-web
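The abstract mentions LoRA-based adaptation without giving details. As a rough illustration only, the sketch below shows the general LoRA idea on a single linear layer in NumPy: the pretrained weight stays frozen, and a small low-rank update is the only trainable part. The class name `LoRALinear`, the rank, the scaling factor `alpha`, and the layer size are all assumptions chosen for illustration; they are not taken from the paper, and the authors' actual setup may differ.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer W plus a low-rank update (alpha / rank) * B @ A.

    Hypothetical illustration of generic LoRA, not the authors' code.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_features, in_features = weight.shape
        self.weight = weight  # pretrained weight, kept frozen
        # A gets a small random init, B starts at zero, so the adapted
        # layer initially reproduces the pretrained layer exactly.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_features))
        self.B = np.zeros((out_features, rank))
        self.scale = alpha / rank

    def __call__(self, x):
        # x has shape (batch, in_features)
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T
        return base + self.scale * update

    def trainable_params(self):
        # Only A and B would receive gradient updates during adaptation.
        return self.A.size + self.B.size


# Usage with an assumed 64x64 layer and rank 4.
rng = np.random.default_rng(1)
W = rng.normal(size=(64, 64))
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(2, 64))
y = layer(x)
```

With these assumed sizes, the adapter trains 512 parameters against 4096 in the frozen weight, which is the sense in which such adaptation is called efficient.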