OpenSkill: Open-World Self-Evolution for LLM Agents

June 4, 2026
Authors: Zhiling Yan, Dingjie Song, Hanrong Zhang, Wei Liang, Yuxuan Zhang, Yutong Dai, Lifang He, Philip S. Yu, Ran Xu, Xiang Li, Lichao Sun
cs.AI

Abstract

Self-evolving agents require adaptation after deployment, but existing approaches assume a usable learning loop, such as curated skills, successful trajectories, or verifier signals. Real open-world deployments may provide none of these, offering only a task prompt. In this work, we study open-world self-evolution, where an agent must build both its skills and its own verification signals from scratch, using open-world resources but no target-task supervision. We propose OpenSkill, a framework that bootstraps this loop: it acquires grounded knowledge and verification anchors from documentation, repositories, and the web, synthesizes them into transferable skills, and refines those skills against self-built virtual tasks grounded in the anchors rather than in target answers. The open world thus supplies both the knowledge to be learned and a supervision-independent practice environment, with target-task supervision reserved for final evaluation. Across three benchmarks and two target agents, OpenSkill attains the best automated pass rate while satisfying the no-supervision constraint. Analysis shows its skills transfer across models without model-specific adaptation, and its self-built verifier aligns with ground-truth outcomes despite never accessing them.
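
The loop described in the abstract (acquire anchors, build a skill, practice on self-built virtual tasks, refine) can be illustrated with a toy sketch. This is not the authors' implementation: `Anchor`, `Skill`, `build_virtual_tasks`, `pass_rate`, and `refine` are hypothetical names, a skill is reduced to a lookup table, and "refinement" simply absorbs the anchored fact. The one property it is meant to show is that the verifier is derived from anchors only, never from target-task answers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Anchor:
    """Hypothetical: a checkable fact taken from docs, repos, or the web."""
    prompt: str
    expected: str


@dataclass
class Skill:
    """Toy stand-in for a skill: a table mapping prompts to responses."""
    knowledge: Dict[str, str] = field(default_factory=dict)

    def act(self, prompt: str) -> str:
        # Unknown prompts yield an empty answer, which any verifier rejects.
        return self.knowledge.get(prompt, "")


VirtualTask = Tuple[str, Callable[[str], bool]]


def build_virtual_tasks(anchors: List[Anchor]) -> List[VirtualTask]:
    """Turn each anchor into a (prompt, verifier) pair.

    The verifier is built from the anchor alone; no target-task
    answers are consulted (assumption mirroring the abstract).
    """
    return [(a.prompt, lambda out, exp=a.expected: out == exp) for a in anchors]


def pass_rate(skill: Skill, tasks: List[VirtualTask]) -> float:
    """Fraction of virtual tasks whose verifier accepts the skill's output."""
    if not tasks:
        return 0.0
    return sum(verify(skill.act(p)) for p, verify in tasks) / len(tasks)


def refine(skill: Skill, anchors: List[Anchor], max_rounds: int = 3) -> Skill:
    """Practice on virtual tasks and patch the skill where it fails."""
    tasks = build_virtual_tasks(anchors)
    grounded = {a.prompt: a.expected for a in anchors}
    for _ in range(max_rounds):
        failures = [p for p, verify in tasks if not verify(skill.act(p))]
        if not failures:
            break
        for p in failures:
            # Toy refinement step: absorb the grounded knowledge directly.
            skill.knowledge[p] = grounded[p]
    return skill


if __name__ == "__main__":
    anchors = [Anchor("2 + 2", "4"), Anchor("capital of France", "Paris")]
    skill = Skill()
    print(pass_rate(skill, build_virtual_tasks(anchors)))
    refine(skill, anchors)
    print(pass_rate(skill, build_virtual_tasks(anchors)))
```

In a real system the refinement step would presumably revise a prompt or procedure rather than copy an answer, and the final evaluation would use held-out target tasks that this loop never sees.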