

LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval

March 2, 2026
Authors: Jiajie Jin, Yanzhao Zhang, Mingxin Li, Dingkun Long, Pengjun Xie, Yutao Zhu, Zhicheng Dou
cs.AI

Abstract

LLMs have fundamentally transformed dense retrieval, upgrading backbones from discriminative encoders to generative architectures. However, a critical disconnect remains: while LLMs possess strong reasoning capabilities, current retrievers predominantly utilize them as static encoders, leaving their potential for complex reasoning unexplored. To address this, existing approaches typically adopt rewrite-then-retrieve pipelines that generate explicit CoT rationales before retrieval, but this incurs prohibitive latency. In this paper, we propose LaSER, a novel self-distillation framework that internalizes explicit reasoning into the latent space of dense retrievers. Operating on a shared LLM backbone, LaSER introduces a dual-view training mechanism: an Explicit view that encodes ground-truth reasoning paths, and a Latent view that performs implicit latent thinking. To bridge the gap between these views, we design a multi-grained alignment strategy. Beyond standard output alignment, we introduce a trajectory alignment mechanism that synchronizes the intermediate latent states of the latent path with the semantic progression of the explicit reasoning segments. This allows the retriever to think silently and effectively without autoregressive text generation. Extensive experiments on both in-domain and out-of-domain reasoning-intensive benchmarks demonstrate that LaSER significantly outperforms state-of-the-art baselines. Furthermore, analyses across diverse backbones and model scales validate the robustness of our approach, confirming that our unified learning framework is essential for eliciting effective latent thinking. Our method successfully combines the reasoning depth of explicit CoT pipelines with the inference efficiency of standard dense retrievers.
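The multi-grained alignment described above can be illustrated with a minimal numerical sketch. This is not the paper's actual implementation: the function names, the cosine-based loss form, and the assumption of a one-to-one pairing between latent "thinking" steps and explicit reasoning segments are all illustrative choices.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity with a small epsilon for numerical stability.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def alignment_losses(explicit_states, latent_states):
    """Hypothetical sketch of the dual-view alignment objective.

    explicit_states: list of vectors from the Explicit view, one per
        ground-truth reasoning segment; the last entry is the view's
        final output embedding.
    latent_states: list of vectors from the Latent view, one per latent
        thinking step; the last entry is the view's final output embedding.

    Output alignment matches the two final embeddings; trajectory
    alignment matches each intermediate latent state to the semantic
    progression of the explicit reasoning segments.
    """
    assert len(explicit_states) == len(latent_states)
    # Output alignment: distill the explicit view's final embedding.
    output_loss = 1.0 - cosine(explicit_states[-1], latent_states[-1])
    # Trajectory alignment: match intermediate states step by step.
    traj_loss = float(np.mean([
        1.0 - cosine(e, l)
        for e, l in zip(explicit_states[:-1], latent_states[:-1])
    ]))
    return output_loss, traj_loss
```

In a real training loop, both views would share the LLM backbone, the explicit states would come from encoding the ground-truth reasoning text, and the combined loss would be added to the standard retrieval objective; only the Latent view is needed at inference time, which is what removes the autoregressive generation cost.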