Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models

October 8, 2025
Authors: Yuntao Gui, James Cheng
cs.AI

Abstract

Despite their remarkable natural language understanding capabilities, Large Language Models (LLMs) have been underutilized for retrieval tasks. We present Search-R3, a novel framework that addresses this limitation by adapting LLMs to generate search embeddings as a direct output of their reasoning process. Our approach exploits LLMs' chain-of-thought capabilities, allowing them to produce more effective embeddings by reasoning step by step through complex semantic analyses. We implement this through three complementary mechanisms: (1) a supervised learning stage that enables the model to produce quality embeddings, (2) a reinforcement learning (RL) methodology that optimizes embedding generation alongside reasoning, and (3) a specialized RL environment that efficiently handles evolving embedding representations without requiring complete corpus re-encoding at each training iteration. Our extensive evaluations on diverse benchmarks demonstrate that Search-R3 significantly outperforms prior methods by unifying the reasoning and embedding generation processes. This integrated post-training approach represents a substantial advancement in handling complex knowledge-intensive tasks that require both sophisticated reasoning and effective information retrieval. Project page: https://github.com/ytgui/Search-R3
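
The abstract does not say how the RL environment in mechanism (3) avoids re-encoding the whole corpus, so the following is only an illustrative sketch of one way such an environment could work, not the authors' implementation. It assumes a cache of possibly stale document embeddings: the cache shortlists candidates, only those candidates are re-encoded with the current encoder, and the reward is the reciprocal rank of the known relevant document. The names `retrieval_reward`, `refresh_k`, `cache`, and `encode` are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieval_reward(
    query_vec: Vector,
    positive_id: str,
    cache: Dict[str, Vector],          # hypothetical cache of stale embeddings
    encode: Callable[[str], Vector],   # hypothetical current-policy encoder
    texts: Dict[str, str],
    refresh_k: int = 2,
) -> float:
    """Reciprocal-rank reward computed over a partially refreshed cache."""
    # 1. Shortlist candidates using the cached (possibly stale) embeddings.
    by_stale_score = sorted(
        cache, key=lambda doc_id: cosine(query_vec, cache[doc_id]), reverse=True
    )
    candidates = by_stale_score[:refresh_k]
    if positive_id not in candidates:
        candidates.append(positive_id)

    # 2. Re-encode only the shortlisted documents; the rest stay untouched.
    for doc_id in candidates:
        cache[doc_id] = encode(texts[doc_id])

    # 3. Rank the refreshed candidates and reward the positive's position.
    ranked = sorted(
        candidates, key=lambda doc_id: cosine(query_vec, cache[doc_id]), reverse=True
    )
    return 1.0 / (ranked.index(positive_id) + 1)
```

In this sketch the cost per training step scales with `refresh_k` rather than with corpus size, at the price of ranking against embeddings that may lag behind the current model for documents outside the shortlist.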