WiseEdit: Benchmarking Cognition- and Creativity-Informed Image Editing

November 29, 2025
Authors: Kaihang Pan, Weile Chen, Haiyi Qiu, Qifan Yu, Wendong Bu, Zehan Wang, Yun Zhu, Juncheng Li, Siliang Tang
cs.AI

Abstract

Recent image editing models boast next-level intelligent capabilities, facilitating cognition- and creativity-informed image editing. Yet, existing benchmarks provide too narrow a scope for evaluation, failing to holistically assess these advanced abilities. To address this, we introduce WiseEdit, a knowledge-intensive benchmark for comprehensive evaluation of cognition- and creativity-informed image editing, featuring deep task depth and broad knowledge breadth. Drawing an analogy to human cognitive creation, WiseEdit decomposes image editing into three cascaded steps, i.e., Awareness, Interpretation, and Imagination, each corresponding to a task that poses a challenge for models to complete at the specific step. It also encompasses complex tasks, where none of the three steps can be finished easily. Furthermore, WiseEdit incorporates three fundamental types of knowledge: Declarative, Procedural, and Metacognitive knowledge. Ultimately, WiseEdit comprises 1,220 test cases, objectively revealing the limitations of SoTA image editing models in knowledge-based cognitive reasoning and creative composition capabilities. The benchmark, evaluation code, and the generated images of each model will be made publicly available soon. Project Page: https://qnancy.github.io/wiseedit_project_page/.