Yunjue Agent Tech Report: A Fully Reproducible, Zero-Start In-Situ Self-Evolving Agent System for Open-Ended Tasks
January 26, 2026
Authors: Haotian Li, Shijun Yang, Weizhen Qi, Silei Zhao, Rui Hua, Mingzhu Song, Xiaojian Yang, Chao Peng
cs.AI
Abstract
Conventional agent systems often struggle in open-ended environments, where task distributions drift continuously and external supervision is scarce. Their reliance on static toolsets or offline training cannot keep pace with these dynamics, leaving the system's capability boundaries rigid and unknowable. To address this, we propose the In-Situ Self-Evolving paradigm, which treats sequential task interactions as a continuous stream of experience, enabling the system to distill short-term execution feedback into long-term, reusable capabilities without access to ground-truth labels. Within this framework, we identify tool evolution as the critical pathway for capability expansion, since it provides verifiable, binary feedback signals. Building on this, we develop Yunjue Agent, a system that iteratively synthesizes, optimizes, and reuses tools to navigate emerging challenges. To improve evolution efficiency, we further introduce a Parallel Batch Evolution strategy. Empirical evaluations on five heterogeneous benchmarks under a zero-start setting demonstrate significant performance gains over proprietary baselines. Complementary warm-start experiments further confirm that the accumulated general knowledge transfers seamlessly to novel domains. Finally, we propose a novel metric for monitoring evolution convergence, which plays a role analogous to the training loss in conventional optimization. We open-source our codebase, system traces, and evolved tools to facilitate future research on resilient, self-evolving intelligence.
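To make the paradigm concrete, below is a minimal, hypothetical Python sketch of an in-situ tool-evolution loop: existing tools are reused when they succeed, a candidate tool is synthesized when they fail, and candidates are kept or discarded based purely on binary execution feedback, with no ground-truth labels. The names `ToolLibrary`, `synthesize_tool`, and `execute` are illustrative assumptions, not the actual Yunjue Agent implementation.

```python
# Hypothetical sketch only: names and signatures are assumptions for
# illustration, not the Yunjue Agent codebase.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToolLibrary:
    """Accumulates reusable tools distilled from execution feedback."""
    tools: Dict[str, Callable] = field(default_factory=dict)

    def add(self, name: str, fn: Callable) -> None:
        self.tools[name] = fn

    def remove(self, name: str) -> None:
        self.tools.pop(name, None)


def synthesize_tool(task: str) -> Tuple[str, Callable]:
    """Stand-in for LLM-driven tool synthesis for an unsolved task."""
    return f"tool_{abs(hash(task)) % 10_000}", lambda *args: None


def execute(task: str, library: ToolLibrary) -> bool:
    """Stand-in executor returning a verifiable binary success signal."""
    return bool(library.tools)


def evolve(task_stream: List[str], library: ToolLibrary) -> ToolLibrary:
    """Process the task stream in order, without ground-truth labels."""
    for task in task_stream:
        if execute(task, library):
            continue                      # existing tools suffice; reuse them
        name, fn = synthesize_tool(task)  # synthesize a candidate tool
        library.add(name, fn)
        if not execute(task, library):    # binary execution feedback
            library.remove(name)          # discard candidates that fail
    return library
```

The reported system additionally processes tasks in parallel batches via the Parallel Batch Evolution strategy; this sequential sketch only illustrates the per-task synthesize-verify-reuse cycle described in the abstract.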