From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

Abstract

Large language model (LLM)-based systems increasingly solve tasks by constructing executable workflows that interleave LLM calls, information retrieval, tool use, code execution, memory updates, and verification. This survey reviews recent methods for designing and optimizing such workflows, which we treat as agentic computation graphs (ACGs). We organize the literature by when workflow structure is determined, where structure refers to which components or agents are present, how they depend on each other, and how information flows between them. This lens distinguishes static methods, which fix a reusable workflow scaffold before deployment, from dynamic methods, which select, generate, or revise the workflow for a particular run before or during execution. We further organize prior work along three dimensions: when structure is determined, what part of the workflow is optimized, and which evaluation signals guide optimization (e.g., task metrics, verifier signals, preferences, or trace-derived feedback). We also distinguish reusable workflow templates, run-specific realized graphs, and execution traces, separating reusable design choices from the structures actually deployed in a given run and from realized runtime behavior. Finally, we outline a structure-aware evaluation perspective that complements downstream task metrics with graph-level properties, execution cost, robustness, and structural variation across inputs. Our goal is to provide a clear vocabulary, a unified framework for positioning new methods, a more comparable view of the existing body of literature, and a more reproducible evaluation standard for future work on workflow optimization for LLM agents.

