EvoMaster: A Foundational Agent Framework for Building Evolving Autonomous Scientific Agents at Scale

Abstract

The convergence of large language models and agents is catalyzing a new era of scientific discovery: Agentic Science. Although the scientific method is inherently iterative, existing agent frameworks are predominantly static, narrowly scoped, and unable to learn from trial and error. To bridge this gap, we present EvoMaster, a foundational evolving agent framework engineered specifically for Agentic Science at Scale. Driven by the core principle of continuous self-evolution, EvoMaster enables agents to iteratively refine hypotheses, critique their own work, and progressively accumulate knowledge across experimental cycles, mirroring human scientific inquiry. As a domain-agnostic base harness, EvoMaster is also easy to scale up: developers can build and deploy highly capable, self-evolving scientific agents for arbitrary disciplines in approximately 100 lines of code. Building on EvoMaster, we incubated the SciMaster ecosystem across domains such as machine learning, physics, and general science. Evaluations on four authoritative benchmarks (Humanity's Last Exam, MLE-Bench Lite, BrowseComp, and FrontierScience) show that EvoMaster achieves state-of-the-art scores of 41.1%, 75.8%, 73.3%, and 53.3%, respectively. It comprehensively outperforms the general-purpose baseline OpenClaw, with relative improvements ranging from +159% to +316%, supporting its efficacy and generality as a foundational framework for the next generation of autonomous scientific discovery. EvoMaster is available at https://github.com/sjtu-sai-agents/EvoMaster.
