Teaching an Agent to Sketch One Part at a Time
March 19, 2026
Authors: Xiaodan Du, Ruize Xu, David Yunis, Yael Vinker, Greg Shakhnarovich
cs.AI
Abstract
We develop a method for producing vector sketches one part at a time. To do this, we train an agent based on a multimodal language model, first with supervised fine-tuning and then with a novel multi-turn process-reward reinforcement learning scheme. Our approach is enabled by a new dataset, ControlSketch-Part, which contains rich part-level annotations for sketches. The annotations are obtained with a novel, generic automatic annotation pipeline that segments vector sketches into semantic parts and assigns paths to parts through a structured multi-stage labeling process. Our results indicate that incorporating structured part-level data and providing the agent with visual feedback throughout the generation process enables interpretable, controllable, and locally editable text-to-vector sketch generation.