Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language
April 21, 2026
Authors: Yi Zhong, Buqiang Xu, Yijun Wang, Zifei Shan, Shuofei Qiao, Guozhou Zheng, Ningyu Zhang
cs.AI
Abstract
Executable visual workflows have emerged as a mainstream paradigm in real-world industrial deployments, offering strong reliability and controllability. In current practice, however, such workflows are built almost entirely by hand: developers must carefully design the workflow, write prompts for each step, and repeatedly revise the logic as requirements evolve, which makes development costly, time-consuming, and error-prone. To study whether large language models can automate this multi-round interaction process, we introduce Chat2Workflow, a benchmark for generating executable visual workflows directly from natural language, and propose a robust agentic framework to mitigate recurrent execution errors. Chat2Workflow is built from a large collection of real-world business workflows, with each instance designed so that the generated workflow can be transformed and directly deployed to practical workflow platforms such as Dify and Coze. Experimental results show that while state-of-the-art language models can often capture high-level intent, they struggle to generate correct, stable, and executable workflows, especially under complex or changing requirements. Although our agentic framework improves the resolve rate by up to 5.34%, the remaining real-world gap positions Chat2Workflow as a foundation for advancing industrial-grade automation. Code is available at https://github.com/zjunlp/Chat2Workflow.