On the Global Photometric Alignment for Low-Level Vision

Abstract

Supervised low-level vision models rely on pixel-wise losses against paired references, yet paired training sets exhibit per-pair photometric inconsistency: different image pairs demand different global brightness, color, or white-balance mappings. This inconsistency arises either from task-intrinsic photometric transfer (e.g., low-light enhancement) or from unintended acquisition shifts (e.g., de-raining), and in either case it causes an optimization pathology: standard reconstruction losses allocate a disproportionate share of the gradient budget to conflicting per-pair photometric targets, crowding out content restoration. In this paper, we analyze this issue and prove that, under a least-squares decomposition, the photometric and structural components of the prediction-target residual are orthogonal, and that the spatially dense photometric component dominates the gradient energy. Motivated by this analysis, we propose the Photometric Alignment Loss (PAL), a flexible supervision objective that discounts nuisance photometric discrepancy via closed-form affine color alignment while preserving restoration-relevant supervision. PAL requires only covariance statistics and a small matrix inversion, with negligible overhead. Across 6 tasks, 16 datasets, and 16 architectures, PAL consistently improves metrics and generalization. The implementation is provided in the appendix.
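To make the idea of closed-form affine color alignment concrete, the following is a minimal NumPy sketch and not the implementation from the appendix. It assumes the alignment is a single 3x3 color matrix plus bias per image pair, fitted by least squares from the prediction to the target using only means and covariances. The function names `affine_color_align` and `aligned_mse`, the ridge term `eps`, and the choice of a squared-error penalty are illustrative assumptions.

```python
import numpy as np


def affine_color_align(pred, target, eps=1e-8):
    """Map pred onto target with a least-squares affine color transform.

    pred, target: float arrays of shape (H, W, 3) or (N, 3).
    Returns the aligned prediction with the same shape as pred.
    Assumption: one global 3x3 matrix plus bias per image pair.
    """
    p = pred.reshape(-1, 3).astype(np.float64)
    t = target.reshape(-1, 3).astype(np.float64)
    mu_p, mu_t = p.mean(axis=0), t.mean(axis=0)
    pc, tc = p - mu_p, t - mu_t          # centered colors
    n = p.shape[0]
    # 3x3 covariance statistics; eps is a small ridge for numerical stability.
    cov_pp = pc.T @ pc / n + eps * np.eye(3)
    cov_pt = pc.T @ tc / n
    # Closed-form solution of min_M ||pc @ M - tc||^2 (a 3x3 linear solve).
    m = np.linalg.solve(cov_pp, cov_pt)
    aligned = pc @ m + mu_t
    return aligned.reshape(pred.shape)


def aligned_mse(pred, target, eps=1e-8):
    """Squared error measured after the global color mismatch is fitted away."""
    aligned = affine_color_align(pred, target, eps)
    return float(np.mean((aligned - target) ** 2))
```

In this sketch, a prediction that differs from its target only by a global affine color change yields a near-zero aligned error, while the plain pixel-wise error stays large. Because the fit is a least-squares projection, the part of the residual removed by alignment (`pred - aligned`) is orthogonal to what remains (`aligned - target`), up to the small ridge term. In training, the same computation would be written in an autodiff framework; whether gradients should flow through the fitted transform is a design choice the abstract does not specify.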

