Structured Distillation of Web Agent Capabilities Enables Generalization
April 9, 2026
Authors: Xing Han Lù, Siva Reddy
cs.AI
Abstract
Frontier LLMs can navigate complex websites, but their cost and reliance on third-party APIs make local deployment impractical. We introduce Agent-as-Annotators, a framework that structures synthetic trajectory generation for web agents by analogy to human annotation roles, replacing the Task Designer, Annotator, and Supervisor with modular LLM components. Using Gemini 3 Pro as teacher, we generate 3,000 trajectories across six web environments and fine-tune a 9B-parameter student with pure supervised learning on the 2,322 that pass quality filtering. The resulting model achieves 41.5% on WebArena, surpassing closed-source models such as Claude 3.5 Sonnet (36.0%) and GPT-4o (31.5%) under the same evaluation protocol, and nearly doubling the previous best open-weight result (Go-Browse, 21.7%). Capabilities transfer to unseen environments, with an 18.2 percentage point gain on WorkArena L1 (an enterprise platform never seen during training) and consistent improvements across three additional benchmarks. Ablations confirm that each pipeline component contributes meaningfully, with Judge filtering, evaluation hints, and reasoning traces each accounting for measurable gains. These results demonstrate that structured trajectory synthesis from a single frontier teacher is sufficient to produce competitive, locally deployable web agents. Project page: https://agent-as-annotators.github.io
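The three roles named in the abstract can be pictured as a small generate-then-filter loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Trajectory` class, the function names, and the callable signatures are all hypothetical, and the LLM-backed components are passed in as plain functions so the structure can be read without any API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    """One synthetic episode (hypothetical container, not from the paper)."""
    task: str
    steps: List[str]        # one reasoning-plus-action string per step
    accepted: bool = False  # verdict assigned by the Supervisor (judge)


def synthesize(
    environments: List[str],
    design_tasks: Callable[[str], List[str]],     # Task Designer role
    annotate: Callable[[str, str], List[str]],    # Annotator role (teacher agent)
    supervise: Callable[[str, List[str]], bool],  # Supervisor role (judge)
) -> List[Trajectory]:
    """Produce one trajectory per designed task and record the judge's verdict."""
    trajectories = []
    for env in environments:
        for task in design_tasks(env):
            steps = annotate(env, task)
            trajectories.append(Trajectory(task, steps, supervise(task, steps)))
    return trajectories


def keep_accepted(trajectories: List[Trajectory]) -> List[Trajectory]:
    """Quality filtering: keep only trajectories the judge accepted."""
    return [t for t in trajectories if t.accepted]


def pass_rate(kept: int, generated: int) -> float:
    """Fraction of generated trajectories that survive filtering."""
    return kept / generated
```

Using the figures reported in the abstract, `pass_rate(2322, 3000)` is 0.774, meaning roughly 77% of the generated trajectories pass quality filtering and are used for supervised fine-tuning of the student.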