Efficient and Principled Scientific Discovery through Bayesian Optimization: A Tutorial
April 1, 2026
Authors: Zhongwei Yu, Rasul Tutunov, Alexandre Max Maraval, Zikai Xie, Zhenzhi Tan, Jiankang Wang, Zijing Li, Liangliang Xu, Qi Yang, Jun Jiang, Sanzhong Luo, Zhenxiao Guo, Haitham Bou-Ammar, Jun Wang
cs.AI
Abstract
Traditional scientific discovery relies on an iterative hypothesise-experiment-refine cycle that has driven progress for centuries, but its intuitive, ad-hoc implementation often wastes resources, yields inefficient designs, and misses critical insights. This tutorial presents Bayesian Optimisation (BO), a principled, probability-driven framework that formalises and automates this core scientific cycle. BO uses surrogate models (e.g., Gaussian processes) to represent empirical observations as evolving hypotheses, and acquisition functions to guide the choice of the next experiment, balancing exploitation of existing knowledge against exploration of uncharted domains to eliminate guesswork and manual trial-and-error. We first frame scientific discovery as an optimisation problem, then unpack BO's core components and end-to-end workflows, and demonstrate its real-world efficacy through case studies in catalysis, materials science, organic synthesis, and molecule discovery. We also cover critical technical extensions for scientific applications, including batched experimentation, heteroscedasticity, contextual optimisation, and human-in-the-loop integration. Tailored for a broad audience, this tutorial bridges AI advances in BO with practical natural science applications, offering tiered content to help cross-disciplinary researchers design more efficient experiments and accelerate principled scientific discovery.
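The loop the abstract describes (fit a surrogate, score candidate experiments with an acquisition function, run the most promising one, repeat) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the tutorial: the one-dimensional design space, the zero-mean Gaussian process with a fixed-length-scale RBF kernel, expected improvement as the acquisition function, and the hypothetical `run_experiment` function standing in for a real measurement.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D input arrays (assumed choice)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP surrogate."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance of the RBF kernel is 1
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (maximisation)."""
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf

def run_experiment(x):
    """Hypothetical stand-in for a lab measurement, e.g. yield versus one setting."""
    return float(np.exp(-(x - 0.7) ** 2 / 0.02) + 0.5 * np.exp(-(x - 0.2) ** 2 / 0.01))

def bayes_opt(n_init=3, n_iter=12, seed=0):
    """Run a small BO loop over a fixed grid of candidate settings in [0, 1]."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 201)
    x_obs = rng.uniform(0.0, 1.0, size=n_init)             # initial random experiments
    y_obs = np.array([run_experiment(x) for x in x_obs])
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, candidates)  # update the "hypothesis"
        ei = expected_improvement(mean, std, y_obs.max())   # score each candidate
        x_next = candidates[np.argmax(ei)]                  # choose the next experiment
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, run_experiment(x_next))
    i_best = int(np.argmax(y_obs))
    return x_obs[i_best], y_obs[i_best], x_obs, y_obs

if __name__ == "__main__":
    best_x, best_y, _, _ = bayes_opt()
    print(f"best setting: {best_x:.3f}, best response: {best_y:.3f}")
```

Expected improvement is large where the surrogate predicts a high response (exploitation) or where its uncertainty is high (exploration), which is one common way to realise the trade-off mentioned above. A practical study would likely also fit the kernel hyperparameters to the data and use an established library rather than this hand-rolled surrogate.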