
MoM: Mixtures of Scenario-Aware Document Memories for Retrieval-Augmented Generation Systems

October 16, 2025
作者: Jihao Zhao, Zhiyuan Ji, Simin Niu, Hanyu Wang, Feiyu Xiong, Zhiyu Li
cs.AI

Abstract

The traditional RAG paradigm, which typically engages in the comprehension of relevant text chunks in response to received queries, inherently restricts both the depth of knowledge internalization and reasoning capabilities. To address this limitation, our research transforms the text processing in RAG from passive chunking to proactive understanding, defining this process as document memory extraction with the objective of simulating human cognitive processes during reading. Building upon this, we propose the Mixtures of scenario-aware document Memories (MoM) framework, engineered to efficiently handle documents from multiple domains and to train small language models (SLMs) to proactively explore and construct document memories. MoM first instructs large language models (LLMs) to simulate domain experts in generating document logical outlines, thereby directing structured chunking and core content extraction. It employs a multi-path sampling and multi-perspective evaluation mechanism, with comprehensive metrics designed to capture chunk clarity and extraction completeness, to select the optimal document memories. Additionally, to infuse deeper human-like reading abilities during the training of SLMs, we incorporate a reverse reasoning strategy, which deduces refined expert thinking paths from high-quality outcomes. Finally, leveraging the diverse forms of content generated by MoM, we develop a three-layer document memory retrieval mechanism, grounded in a theoretical proof from the perspective of probabilistic modeling. Extensive experimental results across three distinct domains demonstrate that the MoM framework not only resolves text chunking challenges in existing RAG systems, providing LLMs with semantically complete document memories, but also paves the way for SLMs to achieve human-centric intelligent text processing.
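The multi-path sampling and multi-perspective evaluation described above can be illustrated with a minimal sketch: sample several candidate document memories, score each candidate on chunk clarity and extraction completeness, and keep the highest-scoring one. The data structure, scoring functions, and weights below are illustrative assumptions for exposition, not the paper's actual metrics or implementation.

```python
# Hypothetical sketch of multi-path sampling with multi-perspective
# evaluation for selecting a document memory. The clarity and
# completeness proxies here are toy stand-ins, not the paper's metrics.
from dataclasses import dataclass


@dataclass
class DocumentMemory:
    outline: str        # LLM-generated logical outline
    chunks: list[str]   # structured chunks guided by the outline
    core_content: str   # extracted core content


def chunk_clarity(memory: DocumentMemory) -> float:
    """Toy clarity proxy: penalize degenerate (very short) chunks."""
    if not memory.chunks:
        return 0.0
    scores = [min(len(c), 200) / 200 for c in memory.chunks]
    return sum(scores) / len(scores)


def extraction_completeness(memory: DocumentMemory, source: str) -> float:
    """Toy completeness proxy: fraction of source tokens covered by chunks."""
    covered = set(" ".join(memory.chunks).split())
    source_tokens = set(source.split())
    return len(covered & source_tokens) / max(len(source_tokens), 1)


def select_best_memory(candidates: list[DocumentMemory], source: str,
                       w_clarity: float = 0.5,
                       w_complete: float = 0.5) -> DocumentMemory:
    """Multi-perspective evaluation: weighted sum of the two metrics."""
    def score(m: DocumentMemory) -> float:
        return (w_clarity * chunk_clarity(m)
                + w_complete * extraction_completeness(m, source))
    return max(candidates, key=score)
```

In the framework as described, the candidates would come from sampling multiple LLM generation paths per document; the selection step then supplies the training targets for the SLMs.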