Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models

Abstract

Despite their remarkable natural language understanding capabilities, Large Language Models (LLMs) have been underutilized for retrieval tasks. We present Search-R3, a novel framework that addresses this limitation by adapting LLMs to generate search embeddings as a direct output of their reasoning process. Our approach exploits LLMs' chain-of-thought capabilities, allowing them to produce more effective embeddings by reasoning step by step through complex semantic analyses. We implement this through three complementary mechanisms: (1) a supervised learning stage that gives the model the ability to produce quality embeddings, (2) a reinforcement learning (RL) methodology that optimizes embedding generation alongside reasoning, and (3) a specialized RL environment that efficiently handles evolving embedding representations without requiring complete corpus re-encoding at each training iteration. Our extensive evaluations on diverse benchmarks demonstrate that Search-R3 significantly outperforms prior methods by unifying the reasoning and embedding generation processes. This integrated post-training approach represents a substantial advancement in handling complex knowledge-intensive tasks that require both sophisticated reasoning and effective information retrieval. Project page: https://github.com/ytgui/Search-R3
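
To make mechanism (3) more concrete, the following is a minimal sketch of one way an environment could avoid re-encoding the whole corpus when the encoder changes between training iterations. The abstract does not describe how Search-R3 achieves this, so everything here is an assumption made for illustration: the `LazyIndex` class, the `toy_encode` function, the `version` tag standing in for a model checkpoint, and the idea of re-encoding only the candidate documents touched by a query are all hypothetical and are not taken from the project's code.

```python
import math

# Hypothetical toy vocabulary; a real system would use a learned encoder.
VOCAB = ["cat", "stock", "market", "food"]


def toy_encode(text, version):
    """Stand-in encoder (assumption): word counts over VOCAB.

    `version` represents the model checkpoint; this toy ignores it,
    whereas a real encoder's output would change as training proceeds.
    """
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class LazyIndex:
    """Hypothetical cache of document embeddings tagged by encoder version.

    Only documents that are actually scored get (re-)encoded, so a new
    checkpoint does not trigger a pass over the full corpus.
    """

    def __init__(self, docs, encode):
        self.docs = docs          # doc_id -> text
        self.encode = encode      # callable(text, version) -> list of floats
        self.cache = {}           # doc_id -> (version, vector)
        self.encode_calls = 0     # counts how often the encoder runs

    def _vector(self, doc_id, version):
        hit = self.cache.get(doc_id)
        if hit is None or hit[0] != version:
            # Stale or missing entry: encode just this one document.
            self.encode_calls += 1
            hit = (version, self.encode(self.docs[doc_id], version))
            self.cache[doc_id] = hit
        return hit[1]

    def search(self, query_vec, candidate_ids, version, k=1):
        """Rank the given candidates by cosine similarity to the query."""
        scored = [
            (cosine(query_vec, self._vector(d, version)), d)
            for d in candidate_ids
        ]
        scored.sort(reverse=True)
        return [d for _, d in scored[:k]]


docs = {"a": "cat sat mat", "b": "stock market news", "c": "cat food"}
index = LazyIndex(docs, toy_encode)
query = toy_encode("cat", 0)
top = index.search(query, ["a", "b"], version=0)
```

In this sketch, document "c" is never encoded because it is not among the candidates, and repeating the same search at the same version reuses the cached vectors. The trade-off is that rankings only consider the supplied candidate set, which is a design choice of the sketch and not a claim about the paper's environment.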
