Agentic AI Systems Should Be Designed as Marginal Token Allocators

Abstract

This position paper argues that agentic AI systems should be designed and evaluated as marginal token allocation economies rather than as text generators priced by the unit. We follow a single request (a developer asking a coding agent to fix a failing test) through four economic layers that today are designed in isolation: a router that decides which model answers; an agent that decides whether to plan, act, verify, or defer; a serving stack that decides how to produce each token; and a training pipeline that decides whether the trace is worth learning from. We show that all four layers solve the same first-order condition (marginal benefit equals marginal cost plus latency cost plus risk cost) with different index sets and different prices. The framing is deliberately minimal: we do not propose a complete theory of AI economics. But adopting marginal token allocation as the shared accounting object explains why systems that minimize tokens locally misallocate them globally, predicts a small set of recurring failure modes (over-routing, over-delegation, under-verification, serving congestion, stale rollouts, cache misuse), and points to a concrete research agenda in token-aware evaluation, autonomy pricing, congestion-priced serving, and risk-adjusted RL budgeting.
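As a rough illustration of the first-order condition stated in the abstract, the following minimal sketch treats each candidate step as worth taking only when its marginal benefit exceeds the sum of its marginal cost, latency cost, and risk cost. The `Action` class, the `choose_action` helper, and all numeric values are hypothetical and chosen for illustration only; they are not taken from the paper, and the greedy selection rule is one simple reading of the condition, not the authors' method.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Action:
    """One candidate use of the next block of tokens (all fields hypothetical)."""
    name: str
    marginal_benefit: float  # expected value gained by spending the tokens
    marginal_cost: float     # price of the tokens themselves
    latency_cost: float      # cost of making the requester wait
    risk_cost: float         # expected cost of the step going wrong

    def net_value(self) -> float:
        # Positive when marginal benefit exceeds the full marginal price.
        full_price = self.marginal_cost + self.latency_cost + self.risk_cost
        return self.marginal_benefit - full_price


def choose_action(actions: Sequence[Action]) -> Optional[Action]:
    """Return the action with the highest positive net value.

    Returning None stands in for "defer": no candidate is worth its price.
    """
    best = max(actions, key=Action.net_value, default=None)
    if best is None or best.net_value() <= 0:
        return None
    return best


# Illustrative numbers only: the same rule applied to an agent's options.
candidates = [
    Action("plan", marginal_benefit=0.30, marginal_cost=0.10,
           latency_cost=0.05, risk_cost=0.01),
    Action("act", marginal_benefit=0.50, marginal_cost=0.20,
           latency_cost=0.10, risk_cost=0.15),
    Action("verify", marginal_benefit=0.40, marginal_cost=0.10,
           latency_cost=0.05, risk_cost=0.02),
]
chosen = choose_action(candidates)
```

With these made-up prices, "act" has the largest raw benefit but "verify" has the largest benefit net of cost, latency, and risk. Swapping the candidate set for models, decoding strategies, or training traces would express the same comparison at the other three layers, under the same assumptions.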
