PosterCopilot: Toward Layout Reasoning and Controllable Editing for Professional Graphic Design

December 3, 2025
Authors: Jiazhe Wei, Ken Li, Tianyu Lao, Haofan Wang, Liang Wang, Caifeng Shan, Chenyang Si
cs.AI

Abstract

Graphic design forms the cornerstone of modern visual communication, serving as a vital medium for promoting cultural and commercial events. Recent advances have explored automating this process using Large Multimodal Models (LMMs), yet existing methods often produce geometrically inaccurate layouts and lack the iterative, layer-specific editing required in professional workflows. To address these limitations, we present PosterCopilot, a framework that advances layout reasoning and controllable editing for professional graphic design. Specifically, we introduce a progressive three-stage training strategy that equips LMMs with geometric understanding and aesthetic reasoning for layout design, consisting of Perturbed Supervised Fine-Tuning, Reinforcement Learning for Visual-Reality Alignment, and Reinforcement Learning from Aesthetic Feedback. Furthermore, we develop a complete workflow that couples the trained LMM-based design model with generative models, enabling layer-controllable, iterative editing for precise element refinement while maintaining global visual consistency. Extensive experiments demonstrate that PosterCopilot achieves geometrically accurate and aesthetically superior layouts, offering unprecedented controllability for professional iterative design.