A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces
February 3, 2026
Authors: Mingxuan Du, Benfeng Xu, Chiwei Zhu, Shaohan Wang, Pengyu Wang, Xiaorui Wang, Zhendong Mao
cs.AI
Abstract
Frontier language models have demonstrated strong reasoning and long-horizon tool-use capabilities. However, existing retrieval-augmented generation (RAG) systems fail to leverage these capabilities, still relying on two paradigms: (1) designing an algorithm that retrieves passages in a single shot and concatenates them into the model's input, or (2) predefining a workflow and prompting the model to execute it step by step. Neither paradigm lets the model participate in retrieval decisions, preventing the system from scaling efficiently as models improve. In this paper, we introduce A-RAG, an agentic RAG framework that exposes hierarchical retrieval interfaces directly to the model. A-RAG provides three retrieval tools: keyword search, semantic search, and chunk read, enabling the agent to adaptively search for and retrieve information at multiple granularities. Experiments on multiple open-domain QA benchmarks show that A-RAG consistently outperforms existing approaches while retrieving a comparable or smaller number of tokens, demonstrating that it effectively leverages model capabilities and dynamically adapts to different RAG tasks. We further systematically study how A-RAG scales with model size and test-time compute. Our code and evaluation suite are available at https://github.com/Ayanami0730/arag.
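To make the three-tool interface concrete, the sketch below shows a minimal, self-contained toy version of such hierarchical retrieval tools. All names (`keyword_search`, `semantic_search`, `read_chunk`) and the toy corpus are illustrative assumptions, not the actual API from the A-RAG repository, and semantic search is approximated with bag-of-words cosine similarity in place of a learned embedder:

```python
# Hypothetical sketch of a hierarchical retrieval interface in the spirit of
# A-RAG's three tools. Function names and the corpus are illustrative only.
from collections import Counter
from math import sqrt

CHUNKS = {
    "c1": "Keyword search matches exact query terms in the corpus.",
    "c2": "Semantic search ranks chunks by embedding similarity.",
    "c3": "Chunk read returns the full text of one chunk by id.",
}

def keyword_search(query: str) -> list[str]:
    """Return ids of chunks containing every query term (exact lexical match)."""
    terms = query.lower().split()
    return [cid for cid, text in CHUNKS.items()
            if all(t in text.lower() for t in terms)]

def _bow(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().replace(".", "").split())

def semantic_search(query: str, k: int = 2) -> list[str]:
    """Return ids of the top-k chunks by cosine similarity to the query."""
    q = _bow(query)
    def cos(c: Counter) -> float:
        dot = sum(q[w] * c[w] for w in q)
        norm = (sqrt(sum(v * v for v in q.values()))
                * sqrt(sum(v * v for v in c.values())))
        return dot / norm if norm else 0.0
    ranked = sorted(CHUNKS, key=lambda cid: cos(_bow(CHUNKS[cid])), reverse=True)
    return ranked[:k]

def read_chunk(chunk_id: str) -> str:
    """Return the full text of one chunk (the finest retrieval granularity)."""
    return CHUNKS[chunk_id]

# An agent would interleave these calls across granularities: locate
# candidate chunks with either search tool, then read the promising ones.
hits = semantic_search("embedding similarity", k=1)
print(keyword_search("keyword search"))  # → ['c1']
print(read_chunk(hits[0]))
```

The point of exposing all three tools, rather than a single retriever, is that the model itself decides which granularity to query at each step, e.g. a broad semantic search first, then targeted keyword searches and chunk reads.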