SproutRAG: Tree Search for Long-Document RAG with Attention-Guided Progressive Embeddings

Abstract

Retrieval-augmented generation (RAG) systems must balance retrieval granularity with contextual coherence, a challenge that existing methods address through LLM-guided chunking, single-level context expansion, or hierarchical summarization. These approaches variously depend on costly LLM calls during indexing or retrieval, limit context aggregation to a single granularity level, or introduce information loss through summarization. We present SproutRAG, an attention-guided hierarchical RAG framework that addresses this trade-off by organizing sentence-level chunks into progressively larger but semantically coherent units, using learned inter-sentence attention to construct a binary chunking tree. Unlike prior approaches that rely on external LLMs, fixed context expansion, or lossy summarization, SproutRAG learns which attention heads and layers best capture semantic document structure, enabling multi-granularity retrieval without additional LLM calls or compressed summaries. At retrieval time, SproutRAG uses hierarchical beam search to retrieve candidates at multiple granularities, capturing multi-sentence relevance beyond flat retrieval. The framework is trained end-to-end with a joint objective that improves both embeddings and tree structure. Experiments across four benchmarks spanning scientific, legal, and open-domain settings demonstrate that SproutRAG improves information efficiency (IE) by 6.1% on average over the strongest baseline. Code is available at https://github.com/AmirAbaskohi/SproutRAG.
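To make the two mechanisms named in the abstract concrete, the following is a minimal illustrative sketch, not the released implementation. It makes several assumptions that the abstract does not state: the inter-sentence attention is already available as a symmetric affinity matrix, the binary tree is built by greedily merging the adjacent pair of spans with the highest mean cross-span affinity, and relevance is supplied by a caller-provided `score` function. The names `Node`, `build_tree`, and `beam_search` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Node:
    start: int                       # first sentence index covered (inclusive)
    end: int                         # last sentence index covered (inclusive)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(affinity: Sequence[Sequence[float]]) -> Node:
    """Build a binary chunk tree over sentences 0..n-1.

    Assumed procedure (not taken from the paper): repeatedly merge the
    adjacent pair of spans whose mean cross-span affinity is highest.
    """
    if len(affinity) == 0:
        raise ValueError("affinity matrix must cover at least one sentence")
    nodes: List[Node] = [Node(i, i) for i in range(len(affinity))]

    def link(a: Node, b: Node) -> float:
        # Mean affinity between every sentence in a and every sentence in b.
        pairs = [(i, j)
                 for i in range(a.start, a.end + 1)
                 for j in range(b.start, b.end + 1)]
        return sum(affinity[i][j] for i, j in pairs) / len(pairs)

    while len(nodes) > 1:
        k = max(range(len(nodes) - 1),
                key=lambda i: link(nodes[i], nodes[i + 1]))
        merged = Node(nodes[k].start, nodes[k + 1].end, nodes[k], nodes[k + 1])
        nodes[k:k + 2] = [merged]
    return nodes[0]


def beam_search(root: Node,
                score: Callable[[Node], float],
                beam_width: int = 2) -> List[Tuple[float, Tuple[int, int]]]:
    """Descend the tree level by level, keeping the `beam_width` best nodes
    at each level. Returns every kept node as (score, (start, end)), sorted
    by descending score, so candidates span several granularities."""
    frontier: List[Node] = [root]
    candidates: List[Tuple[float, Tuple[int, int]]] = []
    while frontier:
        scored = sorted(((score(n), n) for n in frontier),
                        key=lambda t: -t[0])
        kept = scored[:beam_width]
        candidates.extend((s, (n.start, n.end)) for s, n in kept)
        # Expand only the kept nodes; leaves contribute no children.
        frontier = [c for _, n in kept
                    for c in (n.left, n.right) if c is not None]
    return sorted(candidates, key=lambda t: -t[0])


# Toy usage: four sentences, with made-up affinities and relevance values.
toy_affinity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
toy_relevance = [0.0, 0.1, 0.9, 0.8]   # hypothetical per-sentence query relevance


def toy_score(node: Node) -> float:
    # Assumed scoring rule: mean per-sentence relevance over the span.
    span = toy_relevance[node.start:node.end + 1]
    return sum(span) / len(span)


toy_root = build_tree(toy_affinity)
toy_hits = beam_search(toy_root, toy_score, beam_width=1)
```

In this toy setting the returned candidates mix a single sentence, a two-sentence span, and the whole document, which is the multi-granularity behavior the abstract describes. In the actual framework the affinities would come from the learned selection of attention heads and layers, and the scores from the trained embeddings.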