OmniRefiner: Reinforcement-Guided Local Diffusion Refinement
November 25, 2025
Authors: Yaoli Liu, Ziheng Ouyang, Shengtao Lou, Yiren Song
cs.AI
Abstract
Reference-guided image generation has progressed rapidly, yet current diffusion models still struggle to preserve fine-grained visual details when refining a generated image using a reference. This limitation arises because VAE-based latent compression inherently discards subtle texture information, causing identity- and attribute-specific cues to vanish. Moreover, post-editing approaches built on existing methods that amplify local details often produce results that are inconsistent with the original image in lighting, texture, or shape. To address this, we introduce OmniRefiner, a detail-aware refinement framework that performs two consecutive stages of reference-driven correction to enhance pixel-level consistency. We first adapt a single-image diffusion editor by fine-tuning it to jointly ingest the draft image and the reference image, enabling globally coherent refinement while maintaining structural fidelity. We then apply reinforcement learning to further strengthen localized editing capability, explicitly optimizing for detail accuracy and semantic consistency. Extensive experiments demonstrate that OmniRefiner significantly improves reference alignment and fine-grained detail preservation, producing faithful and visually coherent edits that surpass both open-source and commercial models on challenging reference-guided restoration benchmarks.
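To make the two-stage pipeline described in the abstract more concrete, the sketch below illustrates the overall flow: a refinement model that jointly ingests a draft and a reference image (stage 1), followed by a reinforcement-learning update that rewards detail consistency (stage 2). All names here (DraftRefiner, reward_detail_consistency, rl_refinement_step) are hypothetical stand-ins, the backbone is a toy network rather than a diffusion editor, and the reward and policy-gradient details are assumptions; this is an illustrative sketch, not the authors' implementation.

```python
# Minimal sketch of a draft+reference refinement pipeline with a toy
# REINFORCE-style second stage. All components are hypothetical stand-ins.
import torch
import torch.nn as nn


class DraftRefiner(nn.Module):
    """Stand-in for a fine-tuned single-image editor that jointly ingests
    a draft image and a reference image (stage 1 of the pipeline)."""

    def __init__(self, channels: int = 3):
        super().__init__()
        # Toy convolutional backbone; the real editor would be a diffusion model.
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels, 32, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, draft: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        # Concatenate draft and reference along channels so the refinement
        # is conditioned on both inputs.
        return self.net(torch.cat([draft, reference], dim=1))


def reward_detail_consistency(refined: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    """Toy reward: negative pixel-level L1 distance to the reference.
    The paper optimizes detail accuracy and semantic consistency; the exact
    reward is not specified in the abstract, so this is only illustrative."""
    return -(refined - reference).abs().mean(dim=(1, 2, 3))


def rl_refinement_step(model, draft, reference, optimizer, sigma: float = 0.05):
    """Stage 2 sketch: one REINFORCE-style update that perturbs the model's
    output, scores the sample with the reward, and reinforces high-reward samples."""
    refined = model(draft, reference)
    sampled = refined + sigma * torch.randn_like(refined)
    reward = reward_detail_consistency(sampled, reference)
    # Log-probability of the Gaussian perturbation around the model output.
    log_prob = -((sampled.detach() - refined) ** 2).mean(dim=(1, 2, 3)) / (2 * sigma ** 2)
    loss = -(reward.detach() * log_prob).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward.mean().item()


if __name__ == "__main__":
    model = DraftRefiner()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    draft = torch.rand(1, 3, 64, 64)      # globally plausible but detail-poor draft
    reference = torch.rand(1, 3, 64, 64)  # reference carrying the fine details
    print("mean reward:", rl_refinement_step(model, draft, reference, opt))
```

The two-input conditioning mirrors the abstract's description of jointly ingesting draft and reference, while the perturb-score-reinforce loop stands in for the reinforcement-learning stage; in practice the policy would act on diffusion sampling steps or localized edits rather than on raw pixels.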