SPAR: Scholar Paper Retrieval with LLM-based Agents for Enhanced Academic Search
July 21, 2025
Authors: Xiaofeng Shi, Yuduo Li, Qian Kou, Longbin Yu, Jinxin Xie, Hua Zhou
cs.AI
Abstract
Recent advances in large language models (LLMs) have opened new opportunities
for academic literature retrieval. However, existing systems often rely on
rigid pipelines and exhibit limited reasoning capabilities. We introduce SPAR,
a multi-agent framework that incorporates RefChain-based query decomposition
and query evolution to enable more flexible and effective search. To facilitate
systematic evaluation, we also construct SPARBench, a challenging benchmark
with expert-annotated relevance labels. Experimental results demonstrate that
SPAR substantially outperforms strong baselines, achieving up to +56% F1 on
AutoScholar and +23% F1 on SPARBench over the best-performing baseline.
Together, SPAR and SPARBench provide a scalable, interpretable, and
high-performing foundation for advancing research in scholarly retrieval. Code
and data will be available at: https://github.com/xiaofengShi/SPAR