AgentSLR: Automating Systematic Literature Reviews in Epidemiology with Agentic AI

Abstract

Systematic literature reviews (SLRs) are essential for synthesizing scientific evidence, but they are costly, time-intensive and difficult to scale, which creates bottlenecks for evidence-based policy. We study whether large language models can automate the complete systematic review workflow, from article retrieval and screening through data extraction to report synthesis. Applied to epidemiological reviews of nine WHO-designated priority pathogens and validated against expert-curated ground truth, our open-source agentic pipeline (AgentSLR) achieves performance comparable to that of human researchers while reducing review time from approximately 7 weeks to 20 hours (a 58x speed-up). Our comparison of five frontier models shows that SLR performance is driven less by model size or inference cost than by each model's distinctive capabilities. Through human-in-the-loop validation, we identify key failure modes. Our results demonstrate that agentic AI can substantially accelerate scientific evidence synthesis in specialised domains.
