From Scale to Speed: Adaptive Test-Time Scaling for Image Editing

February 24, 2026
Authors: Xiangyan Qu, Zhenlong Yuan, Jing Tang, Rui Chen, Datao Tang, Meng Yu, Lei Sun, Yancheng Bai, Xiangxiang Chu, Gaopeng Gou, Gang Xiong, Yujun Cai
cs.AI

Abstract

Image Chain-of-Thought (Image-CoT) is a test-time scaling paradigm that improves image generation by extending inference time. Most Image-CoT methods focus on text-to-image (T2I) generation. Unlike T2I generation, image editing is goal-directed: the solution space is constrained by the source image and instruction. This mismatch causes three challenges when applying Image-CoT to editing: inefficient resource allocation with fixed sampling budgets, unreliable early-stage verification using general MLLM scores, and redundant edited results from large-scale sampling. To address this, we propose ADaptive Edit-CoT (ADE-CoT), an on-demand test-time scaling framework to enhance editing efficiency and performance. It incorporates three key strategies: (1) a difficulty-aware resource allocation that assigns dynamic budgets based on estimated edit difficulty; (2) edit-specific verification in early pruning that uses region localization and caption consistency to select promising candidates; and (3) depth-first opportunistic stopping, guided by an instance-specific verifier, that terminates when intent-aligned results are found. Extensive experiments on three SOTA editing models (Step1X-Edit, BAGEL, FLUX.1 Kontext) across three benchmarks show that ADE-CoT achieves superior performance-efficiency trade-offs. With comparable sampling budgets, ADE-CoT obtains better performance with more than 2x speedup over Best-of-N.
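
To make the contrast with Best-of-N concrete, the three strategies can be sketched as a toy search loop. This is only an illustration under stated assumptions, not the authors' implementation: every name here (`allocate_budget`, `adaptive_search`, `start`, `finish`, `early_score`, `final_score`) is hypothetical, the linear budget rule is an assumption, and the scoring callables are placeholders for the paper's edit-specific and instance-specific verifiers.

```python
import math
from typing import Callable, Optional, Tuple, TypeVar

P = TypeVar("P")  # partially generated candidate (hypothetical early-stage state)
R = TypeVar("R")  # fully generated edit


def allocate_budget(difficulty: float, max_budget: int) -> int:
    """Assumed rule: scale the sampling budget linearly with difficulty in [0, 1]."""
    difficulty = min(max(difficulty, 0.0), 1.0)
    return max(1, math.ceil(difficulty * max_budget))


def adaptive_search(
    start: Callable[[int], P],          # begins candidate i (cheap)
    finish: Callable[[P], R],           # completes a candidate (expensive)
    early_score: Callable[[P], float],  # placeholder for early-stage verification
    final_score: Callable[[R], float],  # placeholder for the final verifier
    difficulty: float,
    max_budget: int,
    keep_ratio: float = 0.5,
    accept: float = 0.9,
) -> Tuple[Optional[R], float, int]:
    """Return (best edit, its score, number of candidates fully generated)."""
    # (1) Difficulty-aware allocation: easy edits get fewer samples.
    budget = allocate_budget(difficulty, max_budget)
    previews = [start(i) for i in range(budget)]

    # (2) Early pruning: keep only the most promising partial candidates.
    previews.sort(key=early_score, reverse=True)
    survivors = previews[: max(1, math.ceil(keep_ratio * budget))]

    # (3) Depth-first opportunistic stopping: finish one candidate at a time
    # and stop as soon as one meets the acceptance threshold.
    best: Optional[R] = None
    best_score = float("-inf")
    completed = 0
    for preview in survivors:
        result = finish(preview)
        completed += 1
        score = final_score(result)
        if score > best_score:
            best, best_score = result, score
        if score >= accept:
            break
    return best, best_score, completed


# Toy usage: a candidate is just a number standing in for its quality.
qualities = [0.2, 0.95, 0.5, 0.7, 0.1, 0.3, 0.6, 0.4]
identity = lambda x: x
best, score, completed = adaptive_search(
    start=lambda i: qualities[i], finish=identity,
    early_score=identity, final_score=identity,
    difficulty=0.5, max_budget=8,
)
```

In this toy run, Best-of-N with N = 8 would fully generate all eight candidates, whereas the sketch starts four, prunes to two, and completes only one because that one already passes the threshold. How closely this mirrors the speedup reported in the abstract depends entirely on the real verifiers and models, which are not modeled here.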