Stationary (and Thus Compatible) Representations Are All You Need

Abstract

Learning compatible representations aims to learn feature representations that can be used interchangeably over time whenever a model is updated. In this paper, we demonstrate that stationary representations learned by d-Simplex fixed classifiers imply compatibility in the sense of its formal definition. This result establishes a foundation for future work and can be directly exploited in practical learning scenarios. We address the challenge of learning compatibility using d-Simplex fixed classifiers when the model is sequentially fine-tuned. Training with a d-Simplex fixed classifier and the cross-entropy loss aligns feature distributions only at the level of first-order statistics. Consequently, it may not fully capture higher-order dependencies in the representation between model updates. To address this issue, we demonstrate that training the model with a d-Simplex fixed classifier through a convex combination of the cross-entropy loss and a contrastive loss not only captures higher-order dependencies, but is also equivalent to learning with the cross-entropy loss under the compatibility constraints. We confirm our findings with extensive experiments, which also consider a new scenario where a pre-trained model is sequentially fine-tuned and occasionally replaced with an improved model. We show that stationary representations enable uninterrupted retrieval services (without reprocessing gallery images) while improving performance during model updates and replacements, achieving state-of-the-art results. Code is available at https://github.com/miccunifi/iamcl2r.
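As a rough illustration of the two ingredients named above, the following sketch builds fixed class prototypes on a regular simplex and mixes a cross-entropy term with a contrastive term. It is a minimal sketch under stated assumptions, not the implementation in the linked repository: the simplex construction is one standard way to obtain d + 1 equiangular unit vectors in R^d, the function names and the mixing weight `lam` are hypothetical, and the contrastive term is passed in as a plain number because the abstract does not specify its form.

```python
import math


def d_simplex_prototypes(d):
    """Return d + 1 unit vectors in R^d that form a regular simplex.

    Assumed construction: the d standard basis vectors plus one extra
    vertex alpha * (1, ..., 1), centered at the origin and rescaled.
    """
    alpha = (1.0 - math.sqrt(d + 1.0)) / d
    vertices = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    vertices.append([alpha] * d)
    # Center the vertices at the origin.
    centroid = [sum(v[j] for v in vertices) / (d + 1) for j in range(d)]
    centered = [[v[j] - centroid[j] for j in range(d)] for v in vertices]
    # Rescale every vertex to unit norm.
    prototypes = []
    for v in centered:
        norm = math.sqrt(sum(x * x for x in v))
        prototypes.append([x / norm for x in v])
    return prototypes


def cross_entropy_fixed(feature, label, prototypes):
    """Cross-entropy of one feature vector against fixed (untrained) prototypes."""
    logits = [sum(f * w for f, w in zip(feature, p)) for p in prototypes]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[label]


def combined_loss(ce, contrastive, lam):
    """Convex combination of two loss values; lam is a hypothetical weight in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * ce + (1.0 - lam) * contrastive


if __name__ == "__main__":
    W = d_simplex_prototypes(4)  # 5 class prototypes in R^4
    ce = cross_entropy_fixed(W[0], 0, W)
    print(combined_loss(ce, contrastive=0.0, lam=0.5))
```

Because the prototypes are never updated, every pair of them keeps the same cosine similarity of -1/d across model updates, which is the sense in which the classifier is fixed in this sketch.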