GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation

May 20, 2026
Authors: Sixiang Chen, Zhaohu Xing, Tian Ye, Xinyu Geng, Yunlong Lin, Jianyu Lai, Xuanhua He, Fuxiang Zhai, Jialin Gao, Lei Zhu
cs.AI

Abstract

Open-ended image generation is no longer a simple prompt-to-image problem. High-quality generation often requires an agent to combine a model's internal generative ability with external resources. As requests become more diverse and demanding, we aim to develop a general image-generation agent that can self-evolve through trajectories and use tools more effectively across varied generation challenges. To this end, we propose GenEvolve, a self-evolving framework based on Tool-Orchestrated Visual Experience Distillation. In GenEvolve, each generation attempt is modeled as a tool-orchestrated trajectory, where the agent gathers evidence, selects references, invokes generation skills, and composes them into a prompt-reference program. Unlike existing agentic generation methods that mainly rely on image-level scalar rewards, GenEvolve compares multiple trajectories for the same request and abstracts best-worst differences into structured visual experience, provided only to a privileged teacher branch. Inspired by on-policy self-distillation, Visual Experience Distillation provides dense token-level supervision, helping the student agent internalize better search, knowledge activation, reference selection, and prompt construction. We further construct GenEvolve-Data and GenEvolve-Bench. Experiments on public benchmarks and GenEvolve-Bench show substantial gains over strong baselines, achieving state-of-the-art performance among current image-generation frameworks. The project page is at https://ephemeral182.github.io/GenEvolve/
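
The abstract does not give the training objective, so the following is only an illustrative sketch of the two ideas it names: picking the best and worst trajectory for one request, and computing a dense per-token signal between a teacher and a student. Everything here is an assumption made for illustration: the `Trajectory` fields, the scalar `score`, the helper names, and the choice of KL(teacher || student) as the token-level loss. It is not the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Trajectory:
    # Hypothetical fields: the abstract only says a trajectory gathers evidence,
    # selects references, invokes skills, and composes a prompt-reference program.
    steps: List[str]
    score: float  # assumed scalar quality judgment of the resulting image


def best_and_worst(trajectories: List[Trajectory]) -> Tuple[Trajectory, Trajectory]:
    """Return the highest- and lowest-scoring trajectories for one request."""
    best = max(trajectories, key=lambda t: t.score)
    worst = min(trajectories, key=lambda t: t.score)
    return best, worst


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one token position."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def per_token_kl(teacher_logits: List[List[float]],
                 student_logits: List[List[float]]) -> List[float]:
    """KL(teacher || student) at each token position (assumed loss form).

    Each inner list holds the logits over the vocabulary at one position.
    In this sketch the teacher would be the branch that also sees the
    distilled experience; the student sees only the request.
    """
    losses = []
    for t_row, s_row in zip(teacher_logits, student_logits):
        p = softmax(t_row)
        q = softmax(s_row)
        losses.append(sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0))
    return losses
```

Unlike a single image-level reward, the per-position list returned by `per_token_kl` gives one value for every token the student produced, which is the sense in which token-level supervision is "dense".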