Multi-Vector Index Compression in Any Modality

Abstract

We study efficient multi-vector retrieval for late interaction in any modality. Late interaction has emerged as a dominant paradigm for information retrieval in text, images, visual documents, and videos, but its computation and storage costs grow linearly with document length, making it costly for image-, video-, and audio-rich corpora. To address this limitation, we explore query-agnostic methods for compressing multi-vector document representations under a constant vector budget. We introduce four approaches for index compression: sequence resizing, memory tokens, hierarchical pooling, and a novel attention-guided clustering (AGC). AGC uses an attention-guided mechanism to identify the most semantically salient regions of a document as cluster centroids and to weight token aggregation. Evaluating these methods on retrieval tasks spanning text (BEIR), visual documents (ViDoRe), and video (MSR-VTT, MultiVENT 2.0), we show that attention-guided clustering consistently outperforms the other parameterized compression methods (sequence resizing and memory tokens), provides greater flexibility in index size than non-parametric hierarchical clustering, and achieves competitive or improved performance compared to a full, uncompressed index. The source code is available at github.com/hanxiangqin/omni-col-press.
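To make the idea of compressing a document to a fixed vector budget concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a per-token saliency weight is already available (for example, some form of attention mass; how AGC actually derives it is not specified in the abstract). The function names `maxsim` and `compress`, the top-k seeding rule, and the nearest-seed assignment are all illustrative assumptions.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: each query vector takes its best-matching
    document vector, and the per-query maxima are summed."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def compress(doc_vecs, saliency, budget):
    """Reduce a document to at most `budget` vectors, independent of any query.

    saliency: one non-negative weight per token (hypothetical input; assumed
    to come from an attention-like signal).
    """
    doc = [normalize(v) for v in doc_vecs]
    k = min(budget, len(doc))
    # 1. The k most salient tokens seed the clusters (assumed seeding rule).
    seeds = sorted(range(len(doc)), key=lambda i: saliency[i], reverse=True)[:k]
    seed_slot = {token: slot for slot, token in enumerate(seeds)}
    # 2. Each seed stays in its own cluster; every other token joins the
    #    seed it is most similar to.
    clusters = [[] for _ in seeds]
    for i, v in enumerate(doc):
        if i in seed_slot:
            clusters[seed_slot[i]].append(i)
        else:
            best = max(range(k), key=lambda c: dot(v, doc[seeds[c]]))
            clusters[best].append(i)
    # 3. Each cluster becomes a saliency-weighted sum of its members,
    #    re-normalized to unit length.
    dim = len(doc[0])
    return [
        normalize([sum(saliency[i] * doc[i][j] for i in members) for j in range(dim)])
        for members in clusters
    ]

# Toy usage: four 2-d token vectors compressed to a budget of two.
doc = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
saliency = [0.5, 0.1, 0.3, 0.1]
compressed = compress(doc, saliency, budget=2)
score = maxsim([[1.0, 0.0]], compressed)
```

With this setup the index stores `budget` vectors per document regardless of its length, so the cost of `maxsim` no longer grows with the number of tokens. The trade-off is that pooled vectors can only approximate the best match found in the full token set.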
