Agentic AI Systems Should Be Designed as Marginal Token Allocators

May 2, 2026
Author: Siqi Zhu
cs.AI

Abstract

This position paper argues that agentic AI systems should be designed and evaluated as marginal token allocation economies rather than as text generators priced by the unit. We follow a single request -- a developer asking a coding agent to fix a failing test -- through four economic layers that today are designed in isolation: a router that decides which model answers, an agent that decides whether to plan, act, verify, or defer, a serving stack that decides how to produce each token, and a training pipeline that decides whether the trace is worth learning from. We show that all four layers are solving the same first-order condition -- marginal benefit equals marginal cost plus latency cost plus risk cost -- with different index sets and different prices. The framing is deliberately minimal: we do not propose a complete theory of AI economics. But adopting marginal token allocation as the shared accounting object explains why systems that locally minimize tokens globally misallocate them, predicts a small set of recurring failure modes (over-routing, over-delegation, under-verification, serving congestion, stale rollouts, cache misuse), and points to a concrete research agenda in token-aware evaluation, autonomy pricing, congestion-priced serving, and risk-adjusted RL budgeting.
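
The first-order condition stated in the abstract (marginal benefit equals marginal cost plus latency cost plus risk cost) can be read as a stopping rule: keep spending tokens while the next token's benefit exceeds its full price. The following is a minimal sketch of that reading, not the paper's method. The function name `allocate_tokens`, the benefit curve, and all numeric prices are hypothetical, and the sketch assumes diminishing marginal benefit.

```python
from typing import Callable


def allocate_tokens(
    marginal_benefit: Callable[[int], float],
    marginal_cost: float,
    latency_cost: float,
    risk_cost: float,
    max_tokens: int,
) -> int:
    """Return how many tokens to spend under a simple stopping rule.

    Assumes marginal_benefit(n) is non-increasing in n (diminishing
    returns), so the first token whose benefit does not exceed its
    full price marks the stopping point.
    """
    # Full price of one more token: the right-hand side of the condition.
    full_price = marginal_cost + latency_cost + risk_cost
    spent = 0
    while spent < max_tokens and marginal_benefit(spent) > full_price:
        spent += 1
    return spent


# Hypothetical benefit curve: the (n + 1)-th token is worth 1 / (n + 1).
def benefit(n: int) -> float:
    return 1.0 / (n + 1)


# Illustrative prices only; they sum to 0.125.
tokens = allocate_tokens(
    benefit,
    marginal_cost=0.0625,
    latency_cost=0.03125,
    risk_cost=0.03125,
    max_tokens=100,
)
print(tokens)
```

Under this reading, each of the four layers named in the abstract would apply the same rule with its own benefit curve and its own prices. Minimizing only `marginal_cost` while ignoring latency or risk changes the stopping point, which is one way to picture the local-versus-global misallocation the abstract describes.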