ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
May 23, 2025
Authors: Litao Guo, Xinli Xu, Luozhou Wang, Jiantao Lin, Jinsong Zhou, Zixin Zhang, Bolan Su, Ying-Cong Chen
cs.AI
Abstract
With the rapid advancement of generative models, general-purpose generation
has gained increasing attention as a promising approach to unify diverse tasks
across modalities within a single system. Despite this progress, existing
open-source frameworks often remain fragile and struggle to support complex
real-world applications due to the lack of structured workflow planning and
execution-level feedback. To address these limitations, we present ComfyMind, a
collaborative AI system built on the ComfyUI platform and designed to enable
robust and scalable general-purpose generation. ComfyMind introduces two core
innovations: a Semantic Workflow Interface (SWI), which abstracts low-level node
graphs into callable functional modules described in natural language, enabling
high-level composition and reducing structural errors; and a Search Tree Planning
mechanism with localized feedback execution, which models generation as a
hierarchical decision process and allows adaptive correction at each stage.
Together, these components improve the stability and flexibility of complex
generative workflows. We evaluate ComfyMind on three public benchmarks:
ComfyBench, GenEval, and Reason-Edit, which span generation, editing, and
reasoning tasks. Results show that ComfyMind consistently outperforms existing
open-source baselines and achieves performance comparable to GPT-Image-1.
ComfyMind paves a promising path for the development of open-source
general-purpose generative AI systems. Project page:
https://github.com/LitaoGuo/ComfyMind
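
The two ideas named in the abstract can be illustrated with a toy sketch. It is not ComfyMind's implementation or API. The `Workflow` class, the `plan` function, and the example workflows are hypothetical names introduced here, and plain string transformations stand in for real ComfyUI workflows. Each `Workflow` pairs a callable module with a natural-language description (the SWI idea), and `plan` searches a tree with one level per stage, retrying among sibling workflows when a stage fails instead of restarting the whole pipeline (the localized-feedback idea).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Workflow:
    """Hypothetical SWI-style entry: a callable module plus a description."""
    name: str
    description: str
    run: Callable[[str], Optional[str]]  # returns None to signal failure


def plan(stages: List[List[Workflow]], state: str, depth: int = 0,
         trace: Optional[List[str]] = None) -> Optional[Tuple[str, List[str]]]:
    """Depth-first search over candidate workflows, one tree level per stage.

    A failure at one stage triggers a retry among that stage's siblings only;
    results of earlier stages are kept unless every sibling fails.
    """
    trace = [] if trace is None else trace
    if depth == len(stages):
        return state, trace
    for wf in stages[depth]:
        out = wf.run(state)
        if out is None:      # execution-level feedback: this branch failed
            continue         # try the next sibling at the same level
        result = plan(stages, out, depth + 1, trace + [wf.name])
        if result is not None:
            return result
    return None              # all siblings failed; the caller backtracks


# Toy stages: the first upscaler always fails, so the planner falls back
# to its sibling without regenerating the image.
stages = [
    [Workflow("text2img", "Generate an image from a text prompt",
              lambda s: s + " -> image")],
    [Workflow("upscale_broken", "Upscale an image 2x", lambda s: None),
     Workflow("upscale_ok", "Upscale an image 2x",
              lambda s: s + " -> upscaled")],
]
result = plan(stages, "a red cube")
print(result)
```

In a real system the failure signal would presumably come from executing a workflow and evaluating its output, and the candidate list at each level would be chosen by a planner reading the descriptions; both are simplified away here.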