Safe and Scalable Web Agent Learning: A New Paradigm Based on Website Reconstruction

Abstract

Training autonomous web agents is fundamentally limited by the environments they learn from: real-world websites are unsafe to explore, hard to reset, and rarely provide verifiable feedback. We propose VeriEnv, a framework that treats language models as environment creators, automatically cloning real-world websites into fully executable, verifiable synthetic environments. By exposing controlled internal access via a Python SDK, VeriEnv enables agents to self-generate tasks with deterministic, programmatically verifiable rewards, eliminating reliance on heuristic or LLM-based judges. This design decouples agent learning from unsafe real-world interaction while enabling scalable self-evolution through environment expansion. Through experiments on web agent benchmarks, we show that agents trained with VeriEnv generalize to unseen websites, achieve site-specific mastery through self-evolving training, and benefit from scaling the number of training environments. Code and resources will be released at https://github.com/kyle8581/VeriEnv upon acceptance.
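To make the idea of a deterministic, programmatically verifiable reward concrete, the following is a minimal sketch under stated assumptions. It is not the VeriEnv SDK: `ToyShopEnv`, `Task`, `reward`, and the cart example are all hypothetical names invented for illustration. The sketch only shows the general pattern the abstract describes, in which a task carries a verifier that inspects the environment's internal state instead of asking a heuristic or LLM-based judge.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyShopEnv:
    """Hypothetical stand-in for a cloned shopping site (not the VeriEnv SDK)."""
    cart: Dict[str, int] = field(default_factory=dict)

    def reset(self) -> None:
        # A synthetic environment can be restored to a known state at will.
        self.cart.clear()

    def add_to_cart(self, item: str, qty: int = 1) -> None:
        # Stands in for the effect of the agent's UI actions on site state.
        self.cart[item] = self.cart.get(item, 0) + qty


@dataclass
class Task:
    """A task pairs a natural-language instruction with a state-based check."""
    instruction: str
    verify: Callable[[ToyShopEnv], bool]  # reads internal state directly


def reward(env: ToyShopEnv, task: Task) -> float:
    # Deterministic: the same environment state always yields the same reward.
    return 1.0 if task.verify(env) else 0.0


# Hypothetical self-generated task with its programmatic verifier.
task = Task(
    instruction="Add two units of 'usb-cable' to the cart.",
    verify=lambda env: env.cart.get("usb-cable", 0) == 2,
)

env = ToyShopEnv()
env.reset()
reward_before = reward(env, task)  # nothing in the cart yet
env.add_to_cart("usb-cable", 2)    # simulated agent behavior
reward_after = reward(env, task)   # cart now matches the task's condition
```

Because the check reads state rather than a page rendering or a transcript, the reward does not depend on any judge model, and `reset()` lets the same task be retried from an identical starting point.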