SPAR: Enhancing Academic Paper Search with Large Language Model-Based Agents

Abstract

Recent advances in large language models (LLMs) have opened new opportunities for academic literature retrieval. However, existing systems often rely on rigid pipelines and exhibit limited reasoning capabilities. We introduce SPAR, a multi-agent framework that incorporates RefChain-based query decomposition and query evolution to enable more flexible and effective search. To facilitate systematic evaluation, we also construct SPARBench, a challenging benchmark with expert-annotated relevance labels. Experimental results demonstrate that SPAR substantially outperforms strong baselines, with gains of up to +56% F1 on AutoScholar and +23% F1 on SPARBench over the best-performing baseline. Together, SPAR and SPARBench provide a scalable, interpretable, and high-performing foundation for advancing research in scholarly retrieval. Code and data will be available at: https://github.com/xiaofengShi/SPAR