Virtuous Machines: Towards Artificial General Science
August 19, 2025
Authors: Gabrielle Wehr, Reuben Rideaux, Amaya J. Fox, David R. Lightfoot, Jason Tangen, Jason B. Mattingley, Shane E. Ehrhardt
cs.AI
Abstract
Artificial intelligence systems are transforming scientific discovery by
accelerating specific research tasks, from protein structure prediction to
materials design, yet remain confined to narrow domains requiring substantial
human oversight. The exponential growth of scientific literature and increasing
domain specialisation constrain researchers' capacity to synthesise knowledge
across disciplines and develop unifying theories, motivating exploration of
more general-purpose AI systems for science. Here we show that a
domain-agnostic, agentic AI system can independently navigate the scientific
workflow, from hypothesis generation through data collection to manuscript
preparation. The system autonomously designed and executed three psychological
studies on visual working memory, mental rotation, and imagery vividness,
ran one new online data collection with 288 participants, developed
analysis pipelines through continuous coding sessions of more than 8 hours,
and produced completed manuscripts. The results demonstrate the capability of AI scientific
discovery pipelines to conduct non-trivial research with theoretical reasoning
and methodological rigour comparable to experienced researchers, though with
limitations in conceptual nuance and theoretical interpretation. This is a step
toward embodied AI that can test hypotheses through real-world experiments,
accelerating discovery by autonomously exploring regions of scientific space
that human cognitive and resource constraints might otherwise leave unexplored.
It raises important questions about the nature of scientific understanding and
the attribution of scientific credit.