SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation

Abstract

Large Language Models (LLMs) and Multimodal LLMs have shown promising capabilities for SVG processing, yet existing benchmarks suffer from limited real-world coverage, a lack of complexity stratification, and fragmented evaluation paradigms. We introduce SVGenius, a comprehensive benchmark comprising 2,377 queries across three progressive dimensions: understanding, editing, and generation. Built on real-world data from 24 application domains with systematic complexity stratification, SVGenius evaluates models through 8 task categories and 18 metrics. We assess 22 mainstream models spanning different scales, architectures, training paradigms, and accessibility levels. Our analysis reveals that while proprietary models significantly outperform open-source counterparts, all models exhibit systematic performance degradation with increasing complexity, indicating fundamental limitations in current approaches. Reasoning-enhanced training proves more effective than pure scaling for overcoming these limitations, though style transfer remains the most challenging capability across all model types. SVGenius establishes the first systematic evaluation framework for SVG processing, providing crucial insights for developing more capable vector graphics models and advancing automated graphic design applications. Appendix and supplementary materials (including all data and code) are available at https://zju-real.github.io/SVGenius.
