MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery
June 4, 2026
Authors: Shangheng Du, Xiangchao Yan, Jinxin Shi, Zongsheng Cao, Shiyang Feng, Zichen Liang, Boyuan Sun, Tianshuo Peng, Yifan Zhou, Xin Li, Jie Zhou, Liang He, Bo Zhang, Lei Bai
cs.AI
Abstract
Large language model (LLM) agents are increasingly applied to long-horizon tasks such as scientific discovery and machine learning engineering (MLE), where sustained self-evolution becomes a key capability. However, existing MLE agents suffer from inter-branch information isolation, memoryless search, and a lack of hierarchical control, which together hinder long-horizon optimization. We present MLEvolve, an LLM-based self-evolving multi-agent framework for end-to-end machine learning algorithm discovery. By extending tree search to Progressive MCGS, MLEvolve enables cross-branch information flow through graph-based reference edges and uses an entropy-inspired progressive schedule to shift the search gradually from broad exploration to focused exploitation. To let the agent evolve with accumulated experience, we introduce Retrospective Memory, which combines a cold-start domain knowledge base with a dynamic global memory for retrieving and reusing task-specific experience. For stable long-horizon iteration, we further decouple strategic planning from code generation and employ adaptive coding modes. Evaluation on MLE-Bench shows that MLEvolve achieves state-of-the-art performance across multiple dimensions, including average medal rate and valid submission rate, under a 12-hour budget (half the standard runtime). MLEvolve also outperforms specialized algorithm discovery methods, including AlphaEvolve, on mathematical algorithm optimization tasks, demonstrating strong cross-domain generalization. Our code is available at https://github.com/InternScience/MLEvolve.
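The abstract does not spell out the progressive schedule, so the following is only an illustrative sketch of the general idea of moving from exploration to exploitation during search. It assumes a standard UCB1 selection rule and swaps in a simple linear decay of the exploration constant in place of the paper's entropy-inspired schedule. The function names, the `c_start`/`c_end` values, and the node layout (dicts with `mean` and `visits`) are all hypothetical and are not taken from the MLEvolve code.

```python
import math


def exploration_weight(step, total_steps, c_start=1.4, c_end=0.1):
    """Hypothetical schedule: linearly decay the exploration constant.

    Assumption: a linear decay stands in for the paper's entropy-inspired
    schedule, whose exact form is not given in the abstract.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return c_start + (c_end - c_start) * frac


def ucb_score(mean_reward, visits, parent_visits, c):
    """Standard UCB1 score; unvisited nodes get infinite priority."""
    if visits == 0:
        return math.inf
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children, parent_visits, step, total_steps):
    """Return the child with the highest UCB1 score at this search step."""
    c = exploration_weight(step, total_steps)
    return max(
        children,
        key=lambda ch: ucb_score(ch["mean"], ch["visits"], parent_visits, c),
    )


# Illustrative data: one well-explored strong branch, one barely explored branch.
children = [
    {"name": "a", "mean": 0.8, "visits": 50},
    {"name": "b", "mean": 0.6, "visits": 2},
]
early = select_child(children, parent_visits=52, step=0, total_steps=100)
late = select_child(children, parent_visits=52, step=100, total_steps=100)
```

With these made-up numbers, the large early exploration constant favors the rarely visited branch `b`, while the small late constant favors the branch `a` with the higher mean reward, which is the qualitative behavior the abstract describes.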