AIDE: AI-Driven Exploration in the Space of Code
February 18, 2025
Authors: Zhengyao Jiang, Dominik Schmidt, Dhruv Srikanth, Dixing Xu, Ian Kaplan, Deniss Jacenko, Yuxiang Wu
cs.AI
Abstract
Machine learning, the foundation of modern artificial intelligence, has
driven innovations that have fundamentally transformed the world. Yet, behind
these advancements lies a complex and often tedious process requiring labor- and
compute-intensive iteration and experimentation. Engineers and scientists
developing machine learning models spend much of their time on trial-and-error
tasks instead of conceptualizing innovative solutions or research hypotheses.
To address this challenge, we introduce AI-Driven Exploration (AIDE), a machine
learning engineering agent powered by large language models (LLMs). AIDE frames
machine learning engineering as a code optimization problem, and formulates
trial-and-error as a tree search in the space of potential solutions. By
strategically reusing and refining promising solutions, AIDE effectively trades
computational resources for enhanced performance, achieving state-of-the-art
results on multiple machine learning engineering benchmarks, including our
Kaggle evaluations, OpenAI MLE-Bench, and METR's RE-Bench.
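The tree-search framing in the abstract can be illustrated with a minimal sketch. The abstract does not specify AIDE's selection policy or data structures, so everything here is an assumption made for illustration: the `Node` class, the greedy "refine the best node so far" rule, and the toy `toy_evaluate` and `toy_refine` functions are all hypothetical. In an LLM-driven agent, `refine` would presumably ask a model to edit a candidate program and `evaluate` would run it and read a validation metric; here a single number stands in for a program.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One candidate solution in the search tree (hypothetical structure)."""
    solution: float
    score: float
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def tree_search(
    initial: float,
    evaluate: Callable[[float], float],
    refine: Callable[[float, int], float],
    budget: int,
) -> Node:
    """Greedy tree search: repeatedly refine the best-scoring node found so far.

    `budget` is the number of refinement steps, i.e. the compute spent.
    """
    root = Node(initial, evaluate(initial))
    nodes = [root]
    for step in range(budget):
        # Reuse the most promising solution seen so far as the parent.
        parent = max(nodes, key=lambda n: n.score)
        candidate = refine(parent.solution, step)
        child = Node(candidate, evaluate(candidate), parent=parent)
        parent.children.append(child)
        nodes.append(child)
    return max(nodes, key=lambda n: n.score)


# Toy stand-ins (assumptions): a number plays the role of a program.
def toy_evaluate(x: float) -> float:
    # Higher is better; the optimum is at x = 3.
    return -(x - 3.0) ** 2


def toy_refine(x: float, step: int) -> float:
    # Alternate between a small step up and a small step down.
    return x + (0.5 if step % 2 == 0 else -0.5)


best = tree_search(0.0, toy_evaluate, toy_refine, budget=12)
small = tree_search(0.0, toy_evaluate, toy_refine, budget=4)
```

Because the search always keeps every node and returns the best one, a larger `budget` can never yield a worse result than a smaller one under the same `refine` sequence, which is the compute-for-performance trade the abstract describes. Failed refinements (the downward steps here) simply become unexplored leaves rather than derailing the search.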