Efficient and Principled Scientific Discovery through Bayesian Optimization: A Tutorial
April 1, 2026
Authors: Zhongwei Yu, Rasul Tutunov, Alexandre Max Maraval, Zikai Xie, Zhenzhi Tan, Jiankang Wang, Zijing Li, Liangliang Xu, Qi Yang, Jun Jiang, Sanzhong Luo, Zhenxiao Guo, Haitham Bou-Ammar, Jun Wang
cs.AI
Abstract
Traditional scientific discovery relies on an iterative hypothesise-experiment-refine cycle that has driven progress for centuries, but its intuitive, ad-hoc implementation often wastes resources, yields inefficient designs, and misses critical insights. This tutorial presents Bayesian Optimisation (BO), a principled probability-driven framework that formalises and automates this core scientific cycle. BO uses surrogate models (e.g., Gaussian processes) to model empirical observations as evolving hypotheses, and acquisition functions to guide experiment selection, balancing exploitation of existing knowledge with exploration of uncharted domains to eliminate guesswork and manual trial-and-error. We first frame scientific discovery as an optimisation problem, then unpack BO's core components, end-to-end workflows, and real-world efficacy via case studies in catalysis, materials science, organic synthesis, and molecule discovery. We also cover critical technical extensions for scientific applications, including batched experimentation, heteroscedasticity, contextual optimisation, and human-in-the-loop integration. Tailored for a broad audience, this tutorial bridges AI advances in BO with practical natural science applications, offering tiered content to empower cross-disciplinary researchers to design more efficient experiments and accelerate principled scientific discovery.
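The loop described in the abstract (a surrogate model updated after each observation, and an acquisition function that picks the next experiment) can be illustrated with a minimal sketch. This is not the tutorial's own implementation. It assumes a one-dimensional design variable on [0, 1], a zero-mean Gaussian process with a squared-exponential kernel of fixed length scale, and an upper-confidence-bound acquisition rule. The function `run_experiment` is a hypothetical stand-in for a real measurement, and `rbf_kernel`, `gp_posterior`, `bayes_opt`, and `beta` are illustrative names chosen here.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP (prior variance 1)."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def run_experiment(x):
    """Hypothetical objective standing in for a lab measurement (assumption)."""
    return np.exp(-(x - 0.7) ** 2 / 0.02) + 0.5 * np.exp(-(x - 0.2) ** 2 / 0.01)

def bayes_opt(n_iter=15, beta=2.0):
    grid = np.linspace(0.0, 1.0, 201)   # candidate experimental settings
    x_obs = np.array([0.1, 0.5, 0.9])   # small initial design
    y_obs = run_experiment(x_obs)
    for _ in range(n_iter):
        # Surrogate: the current "hypothesis" about the response surface.
        mean, std = gp_posterior(x_obs, y_obs, grid)
        # Acquisition: mean rewards exploitation, std rewards exploration.
        ucb = mean + beta * std
        x_next = grid[np.argmax(ucb)]
        # Run the chosen experiment and add the result to the data set.
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, run_experiment(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best], x_obs, y_obs

best_x, best_y, xs, ys = bayes_opt()
print(f"best setting: {best_x:.3f}, best outcome: {best_y:.3f}")
```

Larger values of `beta` push the search toward untested settings, while smaller values concentrate experiments near the current best estimate. In practice the kernel hyperparameters would be fitted to the data rather than fixed as they are here.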