GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation
May 20, 2026
Authors: Sixiang Chen, Zhaohu Xing, Tian Ye, Xinyu Geng, Yunlong Lin, Jianyu Lai, Xuanhua He, Fuxiang Zhai, Jialin Gao, Lei Zhu
cs.AI
Abstract
Open-ended image generation is no longer a simple prompt-to-image problem. High-quality generation often requires an agent to combine a model's internal generative ability with external resources. As requests grow more diverse and demanding, we aim to develop a general image-generation agent that can self-evolve through trajectories and use tools more effectively across varied generation challenges. To this end, we propose GenEvolve, a self-evolving framework built on Tool-Orchestrated Visual Experience Distillation. In GenEvolve, each generation attempt is modeled as a tool-orchestrated trajectory in which the agent gathers evidence, selects references, invokes generation skills, and composes the results into a prompt-reference program. Unlike existing agentic generation methods, which rely mainly on image-level scalar rewards, GenEvolve compares multiple trajectories for the same request and abstracts the differences between the best and worst trajectories into structured visual experience, which is provided only to a privileged teacher branch. Inspired by on-policy self-distillation, Visual Experience Distillation supplies dense token-level supervision that helps the student internalize better search, knowledge activation, reference selection, and prompt construction. We further construct GenEvolve-Data and GenEvolve-Bench. Experiments on public benchmarks and GenEvolve-Bench show substantial gains over strong baselines and state-of-the-art performance among current image-generation frameworks. Project website: https://ephemeral182.github.io/GenEvolve/
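The abstract gives no formulas, so the following is only a minimal sketch of the two ideas it names: choosing the best and worst trajectories for one request, and computing a dense per-token signal between a teacher and a student. The `Trajectory` structure, the `score` field, the helper names, and the choice of KL(teacher || student) as the divergence are all assumptions made for illustration; they are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Trajectory:
    """One tool-orchestrated generation attempt (hypothetical structure)."""
    steps: list    # e.g. tool calls ending in a prompt-reference program
    score: float   # assumed image-level quality score of the result


def best_worst_pair(trajectories):
    """Return the highest- and lowest-scoring trajectories for one request."""
    best = max(trajectories, key=lambda t: t.score)
    worst = min(trajectories, key=lambda t: t.score)
    return best, worst


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_level_kl(teacher_logits, student_logits):
    """Per-token KL(teacher || student); one value per token position.

    Assumption: the teacher row comes from a branch that also sees the
    structured visual experience, while the student row does not.
    """
    losses = []
    for t_row, s_row in zip(teacher_logits, student_logits):
        p = softmax(t_row)
        q = softmax(s_row)
        losses.append(sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0))
    return losses
```

In this sketch, every token position contributes its own loss term, in contrast to a single scalar reward per generated image. How the best-worst difference is turned into structured visual experience is not specified in the abstract and is left out here.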