EvoMaster: A Foundational Agent Framework for Building Evolving Autonomous Scientific Agents at Scale

Abstract

The convergence of large language models and agents is catalyzing a new era of scientific discovery: Agentic Science. Although the scientific method is inherently iterative, existing agent frameworks are predominantly static and narrowly scoped, and they lack the capacity to learn from trial and error. To bridge this gap, we present EvoMaster, a foundational evolving agent framework engineered specifically for Agentic Science at Scale. Driven by the core principle of continuous self-evolution, EvoMaster enables agents to iteratively refine hypotheses, critique their own work, and progressively accumulate knowledge across experimental cycles, mirroring human scientific inquiry. Crucially, as a domain-agnostic base harness, EvoMaster is easy to scale up: developers can build and deploy highly capable, self-evolving scientific agents for arbitrary disciplines in approximately 100 lines of code. Built upon EvoMaster, we incubated the SciMaster ecosystem across domains such as machine learning, physics, and general science. Evaluations on four authoritative benchmarks (Humanity's Last Exam, MLE-Bench Lite, BrowseComp, and FrontierScience) show that EvoMaster achieves state-of-the-art scores of 41.1%, 75.8%, 73.3%, and 53.3%, respectively. It comprehensively outperforms the general-purpose baseline OpenClaw, with relative improvements ranging from +159% to +316%, supporting its efficacy and generality as a foundational framework for the next generation of autonomous scientific discovery. EvoMaster is available at https://github.com/sjtu-sai-agents/EvoMaster.
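The refine, self-critique, and accumulate cycle described in the abstract can be pictured with a minimal Python sketch. This is not EvoMaster's API: `Memory`, `evolve`, `toy_experiment`, and `toy_critique` are hypothetical names introduced only for illustration, and a single number stands in for a hypothesis that a real agent would express in far richer form.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Memory:
    """Lessons accumulated across experimental cycles (hypothetical structure)."""
    lessons: List[str] = field(default_factory=list)


def evolve(
    hypothesis: float,
    run_experiment: Callable[[float], float],
    critique: Callable[[float, float], Tuple[float, str]],
    cycles: int = 5,
) -> Tuple[float, float, Memory]:
    """Iteratively refine a hypothesis, keeping the best-scoring one seen so far."""
    memory = Memory()
    best, best_score = hypothesis, run_experiment(hypothesis)
    for _ in range(cycles):
        # Self-critique: propose a revised hypothesis and record a lesson.
        candidate, lesson = critique(best, best_score)
        score = run_experiment(candidate)
        memory.lessons.append(lesson)
        # Keep the revision only if the experiment scored strictly higher.
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score, memory


def toy_experiment(x: float) -> float:
    """Assumed stand-in for an experiment: the score peaks at x = 3."""
    return -(x - 3.0) ** 2


def toy_critique(x: float, score: float) -> Tuple[float, str]:
    """Assumed stand-in for self-critique: always try a slightly larger value."""
    proposal = x + 1.0
    return proposal, f"after scoring {score} at {x}, try {proposal}"


if __name__ == "__main__":
    best, best_score, memory = evolve(0.0, toy_experiment, toy_critique, cycles=5)
    print(best, best_score, len(memory.lessons))
```

The loop keeps a revision only when the experiment scores it higher, while every cycle adds a lesson to memory regardless of outcome. That is one simple reading of "accumulating knowledge across experimental cycles", not a description of how EvoMaster implements it.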
