SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation
June 3, 2025
Authors: Siqi Chen, Xinyu Dong, Haolei Xu, Xingyu Wu, Fei Tang, Hang Zhang, Yuchen Yan, Linjuan Wu, Wenqi Zhang, Guiyang Hou, Yongliang Shen, Weiming Lu, Yueting Zhuang
cs.AI
Abstract
Large Language Models (LLMs) and Multimodal LLMs have shown promising
capabilities for SVG processing, yet existing benchmarks suffer from limited
real-world coverage, lack of complexity stratification, and fragmented
evaluation paradigms. We introduce SVGenius, a comprehensive benchmark
comprising 2,377 queries across three progressive dimensions: understanding,
editing, and generation. Built on real-world data from 24 application domains
with systematic complexity stratification, SVGenius evaluates models through 8
task categories and 18 metrics. We assess 22 mainstream models spanning
different scales, architectures, training paradigms, and accessibility levels.
Our analysis reveals that while proprietary models significantly outperform
open-source counterparts, all models exhibit systematic performance degradation
with increasing complexity, indicating fundamental limitations in current
approaches; however, reasoning-enhanced training proves more effective than
pure scaling for overcoming these limitations, though style transfer remains
the most challenging capability across all model types. SVGenius establishes
the first systematic evaluation framework for SVG processing, providing crucial
insights for developing more capable vector graphics models and advancing
automated graphic design applications. Appendix and supplementary materials
(including all data and code) are available at
https://zju-real.github.io/SVGenius.