Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs
May 28, 2026
Authors: Haozhe Zhao, Shuzheng Si, Zhenhailong Wang, Zheng Wang, Liang Chen, Xiaotong Li, Zhixiang Liang, Maosong Sun, Minjia Zhang
cs.AI
Abstract
Scientific figures are among the most effective means of communicating complex research ideas, yet producing publication-quality illustrations remains one of the most labor-intensive parts of paper preparation. Existing automated systems each target a single figure type under text-only input, leaving the diversity of figure types and input conditions that researchers actually use unaddressed; moreover, their raster outputs cannot be locally revised. Because scientific figures are structured compositions of discrete semantic components, the localized errors that generators produce on such layouts demand not a stronger backbone but a harness. We instantiate this harness in two complementary systems: Crafter, a multi-agent harness for figure generation that generalizes across figure types and input conditions without architectural changes, and CraftEditor, which applies the same pattern to convert raster outputs into editable SVGs. We also introduce CraftBench, a benchmark spanning three figure types and four input conditions with human quality annotations. Experiments show that Crafter substantially outperforms both standalone generators and the agentic baseline on PaperBanana-Bench and CraftBench, with ablations confirming each component's independent contribution; CraftEditor faithfully converts outputs into editable SVGs that surpass all baselines. Our code and benchmark are available at https://github.com/HaozheZhao/Crafter.