DeepScientist: Avanzando en Descubrimientos Científicos de Vanguardia de Manera Progresiva

Resumen

Si bien los sistemas anteriores de Científico de IA pueden generar hallazgos novedosos, a menudo carecen del enfoque necesario para producir contribuciones científicamente valiosas que aborden desafíos urgentes definidos por humanos. Presentamos DeepScientist, un sistema diseñado para superar esta limitación mediante la realización de descubrimientos científicos completamente autónomos y orientados a objetivos en períodos de varios meses. Este sistema formaliza el descubrimiento como un problema de Optimización Bayesiana, operacionalizado a través de un proceso de evaluación jerárquico que consiste en "hipotetizar, verificar y analizar". Aprovechando una Memoria de Hallazgos acumulativa, este ciclo equilibra inteligentemente la exploración de nuevas hipótesis con la explotación, promoviendo selectivamente los hallazgos más prometedores a niveles de validación de mayor fidelidad. Consumiendo más de 20,000 horas de GPU, el sistema generó alrededor de 5,000 ideas científicas únicas y validó experimentalmente aproximadamente 1100 de ellas, superando finalmente los métodos de última generación (SOTA) diseñados por humanos en tres tareas de vanguardia de IA en un 183.7\%, 1.9\% y 7.9\%. Este trabajo proporciona la primera evidencia a gran escala de que una IA logra descubrimientos que progresivamente superan el SOTA humano en tareas científicas, produciendo hallazgos valiosos que realmente impulsan la frontera del descubrimiento científico. Para facilitar futuras investigaciones sobre este proceso, publicaremos todos los registros experimentales y el código del sistema en https://github.com/ResearAI/DeepScientist/.

English

While previous AI Scientist systems can generate novel findings, they often lack the focus to produce scientifically valuable contributions that address pressing human-defined challenges. We introduce DeepScientist, a system designed to overcome this by conducting goal-oriented, fully autonomous scientific discovery over month-long timelines. It formalizes discovery as a Bayesian Optimization problem, operationalized through a hierarchical evaluation process consisting of "hypothesize, verify, and analyze". Leveraging a cumulative Findings Memory, this loop intelligently balances the exploration of novel hypotheses with exploitation, selectively promoting the most promising findings to higher-fidelity levels of validation. Consuming over 20,000 GPU hours, the system generated about 5,000 unique scientific ideas and experimentally validated approximately 1100 of them, ultimately surpassing human-designed state-of-the-art (SOTA) methods on three frontier AI tasks by 183.7\%, 1.9\%, and 7.9\%. This work provides the first large-scale evidence of an AI achieving discoveries that progressively surpass human SOTA on scientific tasks, producing valuable findings that genuinely push the frontier of scientific discovery. To facilitate further research into this process, we will open-source all experimental logs and system code at https://github.com/ResearAI/DeepScientist/.

DeepScientist: Avanzando en Descubrimientos Científicos de Vanguardia de Manera Progresiva

DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively

Resumen

Support