Working Notes on Late Interaction Dynamics: Analyzing Targeted Behaviors of Late Interaction Models

Abstract

While Late Interaction models exhibit strong retrieval performance, many of their underlying dynamics remain understudied, potentially hiding performance bottlenecks. In this work, we focus on two topics in Late Interaction retrieval: a length bias that arises when using multi-vector scoring, and the distribution of similarity scores beyond the best scores pooled by the MaxSim operator. We analyze these behaviors for state-of-the-art models on the NanoBEIR benchmark. Results show that the theoretical length bias of causal Late Interaction models holds in practice, and that bi-directional models can also suffer from it in extreme cases. We also find no significant similarity trend beyond the top-1 document token, validating that the MaxSim operator efficiently exploits the token-level similarity scores.
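
For readers unfamiliar with the MaxSim operator mentioned above, the following is a minimal sketch of multi-vector scoring, not the implementation used in this work. It assumes cosine similarity between token vectors and uses hand-written toy embeddings; the names `normalize` and `maxsim_score` are hypothetical. It also shows one generic way document length can interact with the score in this toy setting (appending tokens to a fixed document can only keep or raise its MaxSim score), which is an illustration and not the causal-model analysis of the paper.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_tokens, doc_tokens):
    """Sum, over query tokens, of the best similarity with any document token."""
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    total = 0.0
    for qv in q:
        # Only the top-1 document token contributes for each query token.
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


# Toy, hand-written embeddings (assumption: real models produce these vectors).
query = [[1.0, 0.0], [0.0, 1.0]]
short_doc = [[1.0, 0.0]]
long_doc = short_doc + [[0.0, 1.0]]  # same document with one extra token

short_score = maxsim_score(query, short_doc)
long_score = maxsim_score(query, long_doc)
print(short_score, long_score)
```

In this sketch the longer document scores at least as high as the shorter one because the maximum is taken over a superset of tokens. Real models produce contextual embeddings, so adding tokens also changes the existing vectors and the effect is not guaranteed to be this simple.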
