LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval

Abstract

LLMs have fundamentally transformed dense retrieval, upgrading backbones from discriminative encoders to generative architectures. However, a critical disconnect remains: although LLMs possess strong reasoning capabilities, current retrievers predominantly use them as static encoders, leaving their potential for complex reasoning unexplored. Existing approaches typically address this with rewrite-then-retrieve pipelines that generate explicit chain-of-thought (CoT) rationales before retrieval, but this incurs prohibitive latency. In this paper, we propose LaSER, a novel self-distillation framework that internalizes explicit reasoning into the latent space of dense retrievers. Operating on a shared LLM backbone, LaSER introduces a dual-view training mechanism: an Explicit view that encodes ground-truth reasoning paths, and a Latent view that performs implicit latent thinking. To bridge the gap between these views, we design a multi-grained alignment strategy. Beyond standard output alignment, we introduce a trajectory alignment mechanism that synchronizes the intermediate latent states of the latent path with the semantic progression of the explicit reasoning segments. This allows the retriever to think silently and effectively without autoregressive text generation. Extensive experiments on both in-domain and out-of-domain reasoning-intensive benchmarks demonstrate that LaSER significantly outperforms state-of-the-art baselines. Furthermore, analyses across diverse backbones and model scales validate the robustness of our approach and confirm that the unified learning framework is essential for eliciting effective latent thinking. Our method combines the reasoning depth of explicit CoT pipelines with the inference efficiency of standard dense retrievers.
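The dual-view objective summarized in the abstract can be illustrated with a small sketch. This is a hypothetical illustration under stated assumptions, not the authors' implementation: the abstract does not give the loss formulas, so the in-batch contrastive term, the KL-based output alignment, the cosine-based trajectory alignment, the weights `w_out` and `w_traj`, the temperature, and all function names below are assumptions chosen for readability. Embeddings are plain NumPy arrays standing in for the outputs of a shared LLM backbone.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def log_softmax(x, axis=-1):
    """Numerically stable log-softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def info_nce(q, d, temperature=0.05):
    """In-batch contrastive loss: query i should score document i highest."""
    logits = l2_normalize(q) @ l2_normalize(d).T / temperature
    return -np.mean(np.diag(log_softmax(logits)))

def output_alignment(q_latent, q_explicit, d, temperature=0.05):
    """KL(explicit || latent) over in-batch document score distributions.

    Assumption: the Explicit view acts as the teacher; in a real training
    loop its scores would be detached from the gradient.
    """
    d = l2_normalize(d)
    log_p_exp = log_softmax(l2_normalize(q_explicit) @ d.T / temperature)
    log_p_lat = log_softmax(l2_normalize(q_latent) @ d.T / temperature)
    return np.mean(np.sum(np.exp(log_p_exp) * (log_p_exp - log_p_lat), axis=-1))

def trajectory_alignment(latent_states, segment_embs):
    """Mean (1 - cosine) between the k-th latent state and the k-th
    reasoning-segment embedding. Both inputs have shape (batch, K, dim)."""
    cos = np.sum(l2_normalize(latent_states) * l2_normalize(segment_embs), axis=-1)
    return np.mean(1.0 - cos)

def dual_view_loss(q_latent, q_explicit, d, latent_states, segment_embs,
                   w_out=1.0, w_traj=1.0):
    """Hypothetical combined objective: retrieval loss for both views plus
    output-level and trajectory-level alignment terms."""
    return (info_nce(q_latent, d) + info_nce(q_explicit, d)
            + w_out * output_alignment(q_latent, q_explicit, d)
            + w_traj * trajectory_alignment(latent_states, segment_embs))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch, steps, dim = 4, 3, 8
    q_lat = rng.normal(size=(batch, dim))         # Latent-view query embeddings
    q_exp = rng.normal(size=(batch, dim))         # Explicit-view query embeddings
    docs = rng.normal(size=(batch, dim))          # positive document embeddings
    traj = rng.normal(size=(batch, steps, dim))   # intermediate latent states
    segs = rng.normal(size=(batch, steps, dim))   # reasoning-segment embeddings
    print(dual_view_loss(q_lat, q_exp, docs, traj, segs))
```

In this sketch only the Latent view would be needed at inference time, which is consistent with the abstract's claim that no autoregressive text generation is required during retrieval.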
