Panoptic Pairwise Distortion Graph
April 13, 2026
Authors: Muhammad Kamran Janjua, Abdul Wahab, Bahador Rashidi
cs.AI
Abstract
In this work, we introduce a new perspective on comparative image assessment by representing an image pair as a structured composition of its regions. Existing methods, in contrast, focus on whole-image analysis and rely only implicitly on region-level understanding. We extend the notion of a scene graph from intra-image to inter-image and propose a novel task, the Distortion Graph (DG). A DG treats paired images as a structured topology grounded in regions and represents dense degradation information, such as distortion type, severity, comparison, and quality score, in a compact, interpretable graph structure. To realize the task of learning a distortion graph, we contribute (i) a region-level dataset, PandaSet; (ii) a benchmark suite, PandaBench, with varying region-level difficulty; and (iii) an efficient architecture, Panda, to generate distortion graphs. We demonstrate that PandaBench poses a significant challenge for state-of-the-art multimodal large language models (MLLMs), which fail to understand region-level degradations even when given explicit region cues. We show that training on PandaSet or prompting with DG elicits region-wise distortion understanding, opening a new direction for fine-grained, structured pairwise image assessment.
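To make the idea of a distortion graph more concrete, the following is a minimal sketch of how such a structure might be represented in code. It is not the authors' implementation or the PandaSet schema: the class names, field names, distortion labels, severity scale, quality range, and the `tolerance` threshold are all assumptions chosen for illustration. Each region contributes one node per image, carrying a distortion type, severity, and quality score, and one cross-image edge carrying the comparison.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RegionNode:
    """One region as it appears in one image of the pair (hypothetical schema)."""
    region_id: str
    image: str        # "A" or "B"
    distortion: str   # illustrative labels such as "blur" or "noise"
    severity: int     # assumed ordinal scale, higher means more degraded
    quality: float    # assumed score in [0, 1], higher means better


@dataclass
class ComparisonEdge:
    """Inter-image edge linking the same region across the two images."""
    region_id: str
    relation: str     # "A_better", "B_better", or "similar"


@dataclass
class DistortionGraph:
    nodes: List[RegionNode] = field(default_factory=list)
    edges: List[ComparisonEdge] = field(default_factory=list)


def compare(quality_a: float, quality_b: float, tolerance: float = 0.05) -> str:
    """Derive a comparison label from two quality scores (assumed rule)."""
    if abs(quality_a - quality_b) <= tolerance:
        return "similar"
    return "A_better" if quality_a > quality_b else "B_better"


def build_graph(regions_a: Dict[str, RegionNode],
                regions_b: Dict[str, RegionNode]) -> DistortionGraph:
    """Pair up regions present in both images and add one edge per region."""
    graph = DistortionGraph()
    for region_id in sorted(regions_a.keys() & regions_b.keys()):
        node_a, node_b = regions_a[region_id], regions_b[region_id]
        graph.nodes.extend([node_a, node_b])
        relation = compare(node_a.quality, node_b.quality)
        graph.edges.append(ComparisonEdge(region_id, relation))
    return graph


# Made-up example values, not taken from any dataset.
image_a = {
    "sky": RegionNode("sky", "A", "blur", 2, 0.40),
    "car": RegionNode("car", "A", "none", 0, 0.90),
}
image_b = {
    "sky": RegionNode("sky", "B", "noise", 1, 0.70),
    "car": RegionNode("car", "B", "jpeg", 1, 0.88),
}
graph = build_graph(image_a, image_b)
for edge in graph.edges:
    print(edge.region_id, edge.relation)
```

In this sketch the comparison is derived from the quality scores by a fixed threshold; a learned model such as the one the abstract describes would presumably predict these attributes rather than compute them by rule.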