VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank
May 20, 2025
Authors: Tianhe Wu, Jian Zou, Jie Liang, Lei Zhang, Kede Ma
cs.AI
Abstract
DeepSeek-R1 has demonstrated remarkable effectiveness in incentivizing
reasoning and generalization capabilities of large language models (LLMs)
through reinforcement learning. Nevertheless, the potential of
reasoning-induced computational modeling has not been thoroughly explored in
the context of image quality assessment (IQA), a task critically dependent on
visual reasoning. In this paper, we introduce VisualQuality-R1, a
reasoning-induced no-reference IQA (NR-IQA) model, and we train it with
reinforcement learning to rank, a learning algorithm tailored to the
intrinsically relative nature of visual quality. Specifically, for a pair of
images, we employ group relative policy optimization to generate multiple
quality scores for each image. These estimates are then used to compute
comparative probabilities of one image having higher quality than the other
under the Thurstone model. Rewards for each quality estimate are defined using
continuous fidelity measures rather than discretized binary labels. Extensive
experiments show that the proposed VisualQuality-R1 consistently outperforms
discriminative deep learning-based NR-IQA models as well as a recent
reasoning-induced quality regression method. Moreover, VisualQuality-R1 is
capable of generating contextually rich, human-aligned quality descriptions,
and supports multi-dataset training without requiring perceptual scale
realignment. These features make VisualQuality-R1 especially well-suited for
reliably measuring progress in a wide range of image processing tasks like
super-resolution and image generation.
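The reward computation described in the abstract can be illustrated with a small sketch. The abstract does not state the formulas, so everything below is an assumption made for illustration: the comparative probability uses a Thurstone Case V style form (a Gaussian CDF applied to the score difference, normalized by the pooled variance of the sampled scores), the fidelity measure is taken as sqrt(p * p_hat) + sqrt((1 - p) * (1 - p_hat)), and the group-relative advantage is the usual reward standardization within a group of samples. The function names and the `eps` stabilizer are hypothetical and are not taken from the authors' code.

```python
import math
from statistics import mean, pstdev, pvariance


def gaussian_cdf(z):
    """Standard normal CDF, written with math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def comparative_probability(score, own_scores, other_scores, eps=1e-6):
    """Assumed Thurstone-style probability that the image which received
    `score` has higher quality than the other image in the pair."""
    pooled = pvariance(own_scores) + pvariance(other_scores) + eps
    return gaussian_cdf((score - mean(other_scores)) / math.sqrt(pooled))


def fidelity_reward(p_true, p_pred):
    """Assumed continuous fidelity between the true and predicted preference."""
    return math.sqrt(p_true * p_pred) + math.sqrt((1.0 - p_true) * (1.0 - p_pred))


def rewards_for_image(own_scores, other_scores, p_true):
    """One reward per sampled quality score of the first image."""
    return [
        fidelity_reward(p_true, comparative_probability(s, own_scores, other_scores))
        for s in own_scores
    ]


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within the group of sampled responses."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scores sampled for two images, A and B.
scores_a = [4.0, 4.2, 3.8]
scores_b = [2.0, 2.5, 2.1]

# p_true = 1.0 encodes "A is rated higher than B by human opinion scores".
rewards_a = rewards_for_image(scores_a, scores_b, p_true=1.0)
advantages_a = group_relative_advantages(rewards_a)
```

In this sketch a sampled score is rewarded for ranking its image correctly against the other image's group of scores, not for matching an absolute quality label. That is consistent with the abstract's claim that multi-dataset training needs no perceptual scale realignment, since only the relative order within a pair enters the reward.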