A Study of Global Photometric Alignment in Low-Level Vision

Abstract

Supervised low-level vision models rely on pixel-wise losses against paired references, yet paired training sets exhibit per-pair photometric inconsistency: different image pairs demand different global brightness, color, or white-balance mappings. This inconsistency enters through task-intrinsic photometric transfer (e.g., low-light enhancement) or unintended acquisition shifts (e.g., de-raining), and in either case it causes an optimization pathology. Standard reconstruction losses allocate a disproportionate share of the gradient budget to conflicting per-pair photometric targets, crowding out content restoration. In this paper, we investigate this issue and prove that, under a least-squares decomposition, the photometric and structural components of the prediction-target residual are orthogonal, and that the spatially dense photometric component dominates the gradient energy. Motivated by this analysis, we propose the Photometric Alignment Loss (PAL). This flexible supervision objective discounts nuisance photometric discrepancy via closed-form affine color alignment while preserving restoration-relevant supervision. It requires only covariance statistics and the inversion of a small matrix, so its overhead is negligible. Across 6 tasks, 16 datasets, and 16 architectures, PAL consistently improves metrics and generalization. The implementation is given in the appendix.
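
To make the idea of closed-form affine color alignment concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a 3x3 color matrix plus a bias fitted by least squares from prediction to target, a small ridge term `eps` for numerical stability, and an L1 penalty on the aligned residual. The function names `affine_color_align` and `aligned_l1_loss`, the alignment direction, and the choice of L1 are illustrative assumptions; the abstract does not specify these details.

```python
import numpy as np


def affine_color_align(pred, target, eps=1e-6):
    """Least-squares affine color map (A, b) such that A @ p + b ~ t per pixel.

    pred, target: arrays of shape (H, W, 3). eps is an assumed ridge term.
    """
    p = pred.reshape(-1, 3).astype(np.float64)
    t = target.reshape(-1, 3).astype(np.float64)
    mu_p, mu_t = p.mean(axis=0), t.mean(axis=0)
    pc, tc = p - mu_p, t - mu_t
    n = p.shape[0]
    cov_pp = pc.T @ pc / n   # 3x3 covariance of prediction colors
    cov_tp = tc.T @ pc / n   # 3x3 cross-covariance (target vs. prediction)
    # Normal equations: A = Cov(t, p) Cov(p, p)^{-1}; only a 3x3 inverse is needed.
    A = cov_tp @ np.linalg.inv(cov_pp + eps * np.eye(3))
    b = mu_t - A @ mu_p
    return A, b


def aligned_l1_loss(pred, target, eps=1e-6):
    """L1 distance after removing the best global affine color mismatch."""
    A, b = affine_color_align(pred, target, eps)
    aligned = pred.reshape(-1, 3) @ A.T + b
    return np.abs(aligned - target.reshape(-1, 3)).mean()
```

In this sketch, a target that differs from the prediction only by a global affine color change yields a loss close to zero, whereas a plain L1 loss would penalize the same pair heavily. In a training setting one would typically implement the same algebra in an autodiff framework and decide whether gradients should flow through `A` and `b`; that choice is left open here.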