
RegionE: Adaptive Region-Aware Generation for Efficient Image Editing

October 29, 2025
Authors: Pengtao Chen, Xianfang Zeng, Maosen Zhao, Mingzhu Shen, Peng Ye, Bangyin Xiang, Zhibo Wang, Wei Cheng, Gang Yu, Tao Chen
cs.AI

Abstract

Instruction-based image editing (IIE) has recently received widespread attention. In practice, IIE often modifies only specific regions of an image, while the remaining areas stay largely unchanged. Although these two types of regions differ significantly in generation difficulty and computational redundancy, existing IIE models do not account for this distinction and instead apply a uniform generation process across the entire image. This motivates us to propose RegionE, an adaptive, region-aware generation framework that accelerates IIE tasks without additional training. Specifically, the RegionE framework consists of three main components: 1) Adaptive Region Partition. We observe that the denoising trajectory of unedited regions is straight, so the multi-step denoised result can be inferred in a single step. Therefore, in the early denoising stages, we partition the image into edited and unedited regions based on the difference between the estimated final result and the reference image. 2) Region-Aware Generation. After distinguishing the regions, we replace multi-step denoising with one-step prediction for unedited areas. For edited regions, the trajectory is curved, requiring local iterative denoising. To improve the efficiency and quality of local iterative generation, we propose the Region-Instruction KV Cache, which reduces computational cost while incorporating global information. 3) Adaptive Velocity Decay Cache. Observing that adjacent timesteps in edited regions exhibit strong velocity similarity, we further propose an adaptive velocity decay cache to accelerate the local denoising process. We applied RegionE to state-of-the-art IIE base models, including Step1X-Edit, FLUX.1 Kontext, and Qwen-Image-Edit, where it achieved acceleration factors of 2.57, 2.41, and 2.06, respectively. Evaluations by GPT-4o confirmed that semantic and perceptual fidelity were well preserved.
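
The Adaptive Region Partition step can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the time convention, the difference measure, or the threshold, so everything below is an assumption made for illustration. It assumes a rectified-flow-style path x_t = (1 - t) * x_0 + t * noise with velocity v = noise - x_0, under which a straight trajectory can be extrapolated to t = 0 in one step as x_0 ≈ x_t - t * v. The helper names `one_step_estimate` and `partition_regions`, the scalar "tokens", and the threshold of 0.1 are all hypothetical. For simplicity every token in the toy follows a straight path; in the setting the abstract describes, edited regions are curved, so the estimate there would only be approximate.

```python
def one_step_estimate(x_t, v_t, t):
    """Extrapolate each token along a straight line from time t to t = 0.

    Assumes the convention x_t = (1 - t) * x_0 + t * noise, v = noise - x_0.
    """
    return [x - t * v for x, v in zip(x_t, v_t)]


def partition_regions(x0_hat, reference, threshold):
    """Per-token mask: True means 'edited', False means 'unedited'.

    A token counts as edited when its estimated final value differs from
    the reference image by more than the (hypothetical) threshold.
    """
    return [abs(e - r) > threshold for e, r in zip(x0_hat, reference)]


# Toy data: four scalar "tokens". The last two are changed by the edit.
reference = [0.2, 0.5, 0.8, 0.1]
target = [0.2, 0.5, -0.6, 0.9]
noise = [1.0, -1.0, 0.5, 0.3]

t = 0.75  # an early denoising step (t = 1 is pure noise in this convention)
x_t = [(1 - t) * x0 + t * n for x0, n in zip(target, noise)]
v_t = [n - x0 for x0, n in zip(target, noise)]

x0_hat = one_step_estimate(x_t, v_t, t)
mask = partition_regions(x0_hat, reference, threshold=0.1)
print(mask)
```

Tokens marked unedited could then take the one-step estimate directly, while only the tokens marked edited would continue through iterative denoising, which is where the savings described in the abstract would come from.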