SciDER: Scientific Data-centric End-to-end Researcher

March 2, 2026
Authors: Ke Lin, Yilin Lu, Shreyas Bhat, Xuehang Guo, Junier Oliva, Qingyun Wang
cs.AI

Abstract

Automated scientific discovery with large language models is transforming the research lifecycle from ideation to experimentation, yet existing agents struggle to autonomously process raw data collected from scientific experiments. We introduce SciDER, a data-centric end-to-end system that automates the research lifecycle. Unlike traditional frameworks, our specialized agents collaboratively parse and analyze raw scientific data, generate hypotheses and experimental designs grounded in specific data characteristics, and write and execute the corresponding code. Evaluation on three benchmarks shows that SciDER excels at specialized data-driven scientific discovery and, through its self-evolving memory and critic-led feedback loop, outperforms general-purpose agents and state-of-the-art models. SciDER is distributed as a modular Python package; we also provide easy-to-use PyPI packages with a lightweight web interface, with the aim of accelerating autonomous, data-driven research and making it accessible to all researchers and developers.