GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation
May 20, 2026
Authors: Sixiang Chen, Zhaohu Xing, Tian Ye, Xinyu Geng, Yunlong Lin, Jianyu Lai, Xuanhua He, Fuxiang Zhai, Jialin Gao, Lei Zhu
cs.AI
Abstract
Open-ended image generation is no longer a simple prompt-to-image problem. High-quality generation often requires an agent to combine a model's internal generative ability with external resources. As requests become more diverse and demanding, we aim to develop a general image-generation agent that can self-evolve through its own trajectories and use tools more effectively across varied generation challenges. To this end, we propose GenEvolve, a self-evolving framework based on Tool-Orchestrated Visual Experience Distillation. In GenEvolve, each generation attempt is modeled as a tool-orchestrated trajectory in which the agent gathers evidence, selects references, invokes generation skills, and composes them into a prompt-reference program. Unlike existing agentic generation methods, which rely mainly on image-level scalar rewards, GenEvolve compares multiple trajectories for the same request and abstracts the differences between the best and worst of them into structured visual experience, which is provided only to a privileged teacher branch. Inspired by on-policy self-distillation, Visual Experience Distillation provides dense token-level supervision, helping the student internalize better search, knowledge activation, reference selection, and prompt construction. We further construct GenEvolve-Data and GenEvolve-Bench. Experiments on public benchmarks and GenEvolve-Bench show substantial gains over strong baselines, achieving state-of-the-art performance among current image-generation frameworks. Project website: https://ephemeral182.github.io/GenEvolve/
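The two mechanisms named in the abstract, picking the best and worst trajectories for one request and supervising the student at the token level, can be illustrated with a small sketch. This is not the authors' implementation: the `Trajectory` fields, the scalar `score`, the forward KL direction, and all function names are assumptions made here for illustration, and the abstract does not specify how the structured visual experience is extracted or what loss is used.

```python
import math
from dataclasses import dataclass


@dataclass
class Trajectory:
    """Hypothetical record of one tool-orchestrated generation attempt."""
    steps: list   # tool calls in order, e.g. ["search", "select_reference"]
    score: float  # assumed quality score for the final image


def best_worst(trajectories):
    """Return the highest- and lowest-scoring trajectories for one request."""
    best = max(trajectories, key=lambda t: t.score)
    worst = min(trajectories, key=lambda t: t.score)
    return best, worst


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def token_kl(teacher_logits, student_logits):
    """KL(teacher || student) at one token position (direction is an assumption)."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distill_loss(teacher_seq, student_seq):
    """Mean per-token KL over a sequence: one loss term per token position."""
    kls = [token_kl(t, s) for t, s in zip(teacher_seq, student_seq)]
    return sum(kls) / len(kls)


# Toy usage: three attempts at the same request.
attempts = [
    Trajectory(["search", "generate"], 0.4),
    Trajectory(["search", "select_reference", "generate"], 0.9),
    Trajectory(["generate"], 0.2),
]
best, worst = best_worst(attempts)

# Teacher logits would come from a branch that also sees the distilled
# experience; here they are made-up numbers over a 3-token vocabulary.
teacher = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]]
student = [[1.0, 1.0, 0.1], [0.2, 1.5, 0.3]]
loss = distill_loss(teacher, student)
```

In this sketch every token position contributes its own term to the loss, which is the sense in which token-level supervision is denser than a single scalar reward per image.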