Yunjue Agent Tech Report: A Fully Reproducible, Zero-Start In-Situ Self-Evolving Agent System for Open-Ended Tasks
January 26, 2026
Authors: Haotian Li, Shijun Yang, Weizhen Qi, Silei Zhao, Rui Hua, Mingzhu Song, Xiaojian Yang, Chao Peng
cs.AI
Abstract
Conventional agent systems often struggle in open-ended environments where task distributions continuously drift and external supervision is scarce. Their reliance on static toolsets or offline training lags behind these dynamics, leaving the system's capability boundaries rigid and unknown. To address this, we propose the In-Situ Self-Evolving paradigm. This approach treats sequential task interactions as a continuous stream of experience, enabling the system to distill short-term execution feedback into long-term, reusable capabilities without access to ground-truth labels. Within this framework, we identify tool evolution as the critical pathway for capability expansion because it provides verifiable, binary feedback signals. Building on this insight, we develop Yunjue Agent, a system that iteratively synthesizes, optimizes, and reuses tools to navigate emerging challenges. To improve evolutionary efficiency, we further introduce a Parallel Batch Evolution strategy. Empirical evaluations across five diverse benchmarks under a zero-start setting demonstrate significant performance gains over proprietary baselines. Additionally, complementary warm-start evaluations confirm that the accumulated general knowledge can be seamlessly transferred to novel domains. Finally, we propose a novel metric for monitoring evolution convergence, which plays a role analogous to that of training loss in conventional optimization. We open-source our codebase, system traces, and evolved tools to facilitate future research in resilient, self-evolving intelligence.
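The synthesize-and-reuse loop described above can be illustrated with a minimal sketch. This is not the Yunjue Agent implementation: the names `synthesize_tool` and `run_stream`, the toy tool templates, and the use of "the tool executed without raising" as the binary feedback signal are all assumptions made for illustration. The sketch starts from an empty tool library (zero-start), processes tasks one at a time, and keeps a newly synthesized tool only when it executes successfully, without consulting any ground-truth answer.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Tool = Callable[[int], int]


def synthesize_tool(kind: str) -> Tool:
    """Hypothetical stand-in for model-driven tool synthesis."""
    templates: Dict[str, Tool] = {
        "double": lambda x: x * 2,
        "square": lambda x: x * x,
    }
    # A KeyError here models a synthesis attempt that fails.
    return templates[kind]


def run_stream(tasks: Iterable[Tuple[str, int]]) -> Tuple[Dict[str, Tool], List[bool]]:
    """Process tasks in order, growing a tool library from nothing.

    Returns the final library and, per task, whether a new tool was added.
    """
    library: Dict[str, Tool] = {}  # zero-start: no tools at the beginning
    created: List[bool] = []
    for kind, arg in tasks:
        is_new = kind not in library
        try:
            tool = synthesize_tool(kind) if is_new else library[kind]
            tool(arg)  # binary feedback (assumed): did the tool run at all?
            ok = True
        except Exception:
            ok = False
        if ok and is_new:
            library[kind] = tool  # keep the verified tool for later reuse
        created.append(ok and is_new)
    return library, created


tasks = [("double", 1), ("square", 2), ("double", 3), ("cube", 4), ("square", 5)]
library, created = run_stream(tasks)
print(sorted(library), created)
```

In this toy stream, new tools are added only on the first two tasks; later tasks either reuse an existing tool or fail to produce one. The sketch processes tasks sequentially and does not attempt to model the Parallel Batch Evolution strategy or the convergence metric, whose definitions are not given in the abstract.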