Col-Bandit: Zero-Shot Query-Time Pruning for Late-Interaction Retrieval

February 2, 2026
Authors: Roi Pony, Adi Raz, Oshri Naparstek, Idan Friedman, Udi Barzelay
cs.AI

Abstract

Multi-vector late-interaction retrievers such as ColBERT achieve state-of-the-art retrieval quality, but their query-time cost is dominated by exhaustively computing token-level MaxSim interactions for every candidate document. While approximating late interaction with single-vector representations reduces cost, it often incurs substantial accuracy loss. We introduce Col-Bandit, a query-time pruning algorithm that reduces this computational burden by casting reranking as a finite-population Top-K identification problem. Col-Bandit maintains uncertainty-aware bounds over partially observed document scores and adaptively reveals only the (document, query token) MaxSim entries needed to determine the top results under statistical decision bounds with a tunable relaxation. Unlike coarse-grained approaches that prune entire documents or tokens offline, Col-Bandit sparsifies the interaction matrix on the fly. It operates as a zero-shot, drop-in layer over standard multi-vector systems, requiring no index modifications, offline preprocessing, or model retraining. Experiments on textual (BEIR) and multimodal (REAL-MM-RAG) benchmarks show that Col-Bandit preserves ranking fidelity while reducing MaxSim FLOPs by up to 5×, indicating that dense late-interaction scoring contains substantial redundancy that can be identified and pruned efficiently at query time.
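
To make the idea of adaptively revealing (document, query token) MaxSim entries concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: the abstract describes statistical decision bounds with a tunable relaxation, whereas this sketch swaps in hard deterministic bounds that hold only under the assumption of unit-norm embeddings (so every unrevealed entry lies in [-1, 1]). The function names `maxsim`, `exhaustive_scores`, and `pruned_top_k`, and the rule for choosing which entry to reveal next, are all hypothetical.

```python
def maxsim(q_vec, doc_vecs):
    """One MaxSim entry: best dot-product match for a query token in a document."""
    return max(sum(a * b for a, b in zip(q_vec, d)) for d in doc_vecs)


def exhaustive_scores(query, docs):
    """Full late-interaction score: sum of MaxSim over all query tokens."""
    return [sum(maxsim(q, doc) for q in query) for doc in docs]


def pruned_top_k(query, docs, k, lo=-1.0, hi=1.0):
    """Return (sorted top-k doc ids, number of MaxSim entries computed).

    Assumption (not from the paper): embeddings are unit-norm, so each
    unrevealed entry lies in [lo, hi]. Requires 1 <= k.
    """
    n_tok = len(query)
    partial = [0.0] * len(docs)  # sum of revealed entries per document
    seen = [0] * len(docs)       # entries revealed so far (tokens taken in order)
    computed = 0
    while True:
        # Bounds on each full score given the entries revealed so far.
        lower = [p + lo * (n_tok - s) for p, s in zip(partial, seen)]
        upper = [p + hi * (n_tok - s) for p, s in zip(partial, seen)]
        order = sorted(range(len(docs)), key=lambda d: -lower[d])
        top, rest = order[:k], order[k:]
        if not rest:
            return sorted(top), computed
        weakest = min(top, key=lambda d: lower[d])
        challenger = max(rest, key=lambda d: upper[d])
        # Stop once no outside document can still overtake the current top-k.
        if lower[weakest] >= upper[challenger]:
            return sorted(top), computed
        # Otherwise reveal one more entry for each document in the critical pair.
        for d in (weakest, challenger):
            if seen[d] < n_tok:
                partial[d] += maxsim(query[seen[d]], docs[d])
                seen[d] += 1
                computed += 1
```

With these hard bounds the returned set always matches the exhaustive Top-K (up to ties), but the bounds are loose, so the number of skipped entries may be small. The savings reported in the abstract come from the paper's statistical bounds and relaxation, which this sketch does not reproduce.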
February 11, 2026