DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively

September 30, 2025
Authors: Yixuan Weng, Minjun Zhu, Qiujie Xie, Qiyao Sun, Zhen Lin, Sifan Liu, Yue Zhang
cs.AI

Abstract

While previous AI Scientist systems can generate novel findings, they often lack the focus to produce scientifically valuable contributions that address pressing human-defined challenges. We introduce DeepScientist, a system designed to overcome this by conducting goal-oriented, fully autonomous scientific discovery over month-long timelines. It formalizes discovery as a Bayesian Optimization problem, operationalized through a hierarchical evaluation process consisting of "hypothesize, verify, and analyze". Leveraging a cumulative Findings Memory, this loop intelligently balances the exploration of novel hypotheses with exploitation, selectively promoting the most promising findings to higher-fidelity levels of validation. Consuming over 20,000 GPU hours, the system generated about 5,000 unique scientific ideas and experimentally validated approximately 1,100 of them, ultimately surpassing human-designed state-of-the-art (SOTA) methods on three frontier AI tasks by 183.7%, 1.9%, and 7.9%. This work provides the first large-scale evidence of an AI achieving discoveries that progressively surpass human SOTA on scientific tasks, producing valuable findings that genuinely push the frontier of scientific discovery. To facilitate further research into this process, we will open-source all experimental logs and system code at https://github.com/ResearAI/DeepScientist/.
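
The abstract does not say which acquisition rule or data structures DeepScientist uses, so the following is only a minimal sketch of the general idea: a memory of findings, a score that trades off exploitation against exploration, and promotion of the top-scoring findings to the next validation level. The upper-confidence-bound (UCB) score, the `Finding` record, the three fidelity levels, and the halving of uncertainty after a measurement are all illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Finding:
    """One record in a hypothetical findings memory (names are assumptions)."""
    idea: str
    predicted_value: float      # surrogate estimate of the idea's payoff
    uncertainty: float          # how unsure the surrogate is about that estimate
    fidelity: int = 0           # assumed levels: 0 hypothesized, 1 verified, 2 analyzed
    measured: Optional[float] = None


def ucb(finding: Finding, kappa: float = 1.0) -> float:
    """Upper-confidence-bound score: exploitation term plus kappa times exploration term."""
    return finding.predicted_value + kappa * finding.uncertainty


def select_for_promotion(memory: List[Finding], fidelity: int,
                         budget: int, kappa: float = 1.0) -> List[Finding]:
    """Pick the `budget` highest-scoring findings currently at `fidelity`."""
    candidates = [f for f in memory if f.fidelity == fidelity]
    return sorted(candidates, key=lambda f: ucb(f, kappa), reverse=True)[:budget]


def promote(finding: Finding, measured: float) -> None:
    """Record a result from the more expensive evaluation and move up one level."""
    finding.fidelity += 1
    finding.measured = measured
    finding.predicted_value = measured
    finding.uncertainty *= 0.5  # arbitrary shrink factor, chosen only for illustration


if __name__ == "__main__":
    memory = [
        Finding("idea A", predicted_value=0.6, uncertainty=0.05),
        Finding("idea B", predicted_value=0.4, uncertainty=0.4),
        Finding("idea C", predicted_value=0.5, uncertainty=0.1),
    ]
    # With kappa = 1.0 the uncertain idea B is worth exploring.
    print([f.idea for f in select_for_promotion(memory, fidelity=0, budget=2, kappa=1.0)])
    # With kappa = 0.0 the selection is purely exploitative.
    print([f.idea for f in select_for_promotion(memory, fidelity=0, budget=2, kappa=0.0)])
```

In this sketch a larger `kappa` favors uncertain ideas (exploration), while `kappa = 0.0` ranks findings by predicted value alone (exploitation). Restricting each call to one fidelity level is one simple way to model the "selectively promoting" step, since the costlier levels only ever see findings that survived the cheaper ones.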
October 1, 2025