Xetrieval: Mechanistically Explaining Dense Retrieval

May 28, 2026
Authors: Zhixin Cai, Jun Bai, Yang Liu, Jiaqi Li, Yichi Zhang, Taichuan Li, Zhuofan Chen, Zixia Jia, Zilong Zheng, Wenge Rong
cs.AI

Abstract

Explaining why dense retrievers assign high relevance scores remains challenging because retrieval decisions are made through opaque high-dimensional embeddings. Existing explanations often focus on surface signals, such as lexical matches, token alignments, or post-hoc textual rationales, and thus provide limited insight into the latent factors that shape dense retrieval behavior at the embedding level. We propose Xetrieval, an embedding-level mechanistic framework for explaining dense retrieval. Xetrieval first introduces a lightweight reasoning internalizer that approximates Chain-of-Thought reasoning directly in the embedding space with a single forward pass, enriching sentence embeddings with reasoning-oriented information while avoiding expensive autoregressive generation. It then decomposes these reasoning-enhanced embeddings into sparse, human-interpretable features, each associated with a coherent natural language description. By aggregating sparse feature overlaps across multiple document-side views, Xetrieval provides feature-level explanations of individual retrieval decisions. Experiments on diverse retrievers and benchmarks show that Xetrieval uncovers coherent interpretable features, yields stronger pair-level intervention effects, and supports task-level feature steering. The project page and source code are available at https://hihiczx.github.io/Xetrieval .
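
The abstract does not specify how the sparse decomposition or the overlap aggregation is computed, so the following is only a toy sketch of the general idea, not the authors' implementation. It assumes a top-k ReLU projection as the sparse encoder (with random weights standing in for a trained dictionary) and assumes that overlap is the product of query and document activations on a shared feature, averaged over document-side views. The names `sparse_encode` and `explain_pair` are hypothetical.

```python
import random

def sparse_encode(embedding, weights, k):
    """Assumed encoder: project onto feature directions, apply ReLU, keep the top-k activations."""
    acts = [max(0.0, sum(w * x for w, x in zip(row, embedding))) for row in weights]
    top = sorted(range(len(acts)), key=lambda i: acts[i], reverse=True)[:k]
    # Return a sparse {feature_id: activation} map, dropping zero activations.
    return {i: acts[i] for i in top if acts[i] > 0.0}

def explain_pair(query_feats, doc_view_feats):
    """Assumed aggregation: average the per-feature activation product over document views."""
    scores = {}
    for view in doc_view_feats:
        for i, q_act in query_feats.items():
            if i in view:
                scores[i] = scores.get(i, 0.0) + q_act * view[i]
    n = len(doc_view_feats)
    # Features with the largest averaged overlap come first.
    return sorted(((i, s / n) for i, s in scores.items()),
                  key=lambda t: t[1], reverse=True)

# Toy data: random weights stand in for a trained feature dictionary,
# and noisy copies of the query stand in for document-side views.
rng = random.Random(0)
dim, n_features, k = 8, 16, 4
weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_features)]
query = [rng.gauss(0, 1) for _ in range(dim)]
doc_views = [[q + 0.1 * rng.gauss(0, 1) for q in query] for _ in range(3)]

q_feats = sparse_encode(query, weights, k)
d_feats = [sparse_encode(v, weights, k) for v in doc_views]
for feature_id, score in explain_pair(q_feats, d_feats):
    print(feature_id, round(score, 3))
```

In a real system each feature id would map to a natural language description, so the ranked list would read as a feature-level explanation of why the pair scored highly.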