UI-Venus-1.5 Technical Report
February 9, 2026
Authors: Venus-Team, Changlong Gao, Zhangxuan Gu, Yulin Liu, Xinyu Qiu, Shuheng Shen, Yue Wen, Tianyu Xia, Zhenyu Xu, Zhengwen Zeng, Beitong Zhou, Xingran Zhou, Weizhi Chen, Sunhao Dai, Jingya Dou, Yichen Gong, Yuan Guo, Zhenlin Guo, Feng Li, Qian Li, Jinzhen Lin, Yuqi Zhou, Linchao Zhu, Liang Chen, Zhenyu Guo, Changhua Meng, Weiqiang Wang
cs.AI
Abstract
GUI agents have emerged as a powerful paradigm for automating interactions in digital environments, yet achieving both broad generality and consistently strong task performance remains challenging. In this report, we present UI-Venus-1.5, a unified, end-to-end GUI agent designed for robust real-world applications. The model family comprises two dense variants (2B and 8B) and one mixture-of-experts variant (30B-A3B) to cover a range of downstream application scenarios. Compared to our previous version, UI-Venus-1.5 introduces three key technical advances: (1) a comprehensive Mid-Training stage that uses 10 billion tokens across 30+ datasets to establish foundational GUI semantics; (2) Online Reinforcement Learning with full-trajectory rollouts, which aligns training objectives with long-horizon, dynamic navigation in large-scale environments; and (3) a single unified GUI agent constructed via Model Merging, which combines domain-specific models (grounding, web, and mobile) into one cohesive checkpoint. Extensive evaluations show that UI-Venus-1.5 establishes new state-of-the-art performance on benchmarks such as ScreenSpot-Pro (69.6%), VenusBench-GD (75.0%), and AndroidWorld (77.6%), significantly outperforming previous strong baselines. In addition, UI-Venus-1.5 demonstrates robust navigation capabilities across a variety of Chinese mobile apps, effectively executing user instructions in real-world scenarios. Code: https://github.com/inclusionAI/UI-Venus; Model: https://huggingface.co/collections/inclusionAI/ui-venus
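As a rough illustration of the Model Merging idea mentioned in the abstract, the sketch below averages the parameters of several specialist checkpoints into one. The abstract does not say which merging method the report uses, so weighted parameter averaging is an assumption here, and the function name, parameter names, and toy values are all hypothetical.

```python
from typing import Dict, List, Sequence

# A checkpoint is modeled as a mapping from parameter name to a flat list of
# floats. Real checkpoints hold tensors; this is a toy stand-in.
StateDict = Dict[str, List[float]]


def merge_checkpoints(checkpoints: Sequence[StateDict],
                      weights: Sequence[float]) -> StateDict:
    """Return the weighted average of checkpoints with matching parameters.

    Assumption: plain weighted averaging, which may differ from the merging
    technique actually used in the report.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    if len(checkpoints) != len(weights):
        raise ValueError("one weight per checkpoint is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so weights sum to 1

    reference = checkpoints[0]
    merged: StateDict = {}
    for name, ref_values in reference.items():
        # Every checkpoint must provide the parameter with the same length.
        for ckpt in checkpoints:
            if name not in ckpt or len(ckpt[name]) != len(ref_values):
                raise ValueError(
                    f"parameter {name!r} is missing or has a different shape")
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(norm, checkpoints))
            for i in range(len(ref_values))
        ]
    return merged


# Hypothetical toy values standing in for the grounding, web, and mobile models.
grounding = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
web = {"layer.weight": [3.0, 4.0], "layer.bias": [3.0]}
mobile = {"layer.weight": [5.0, 6.0], "layer.bias": [6.0]}

unified = merge_checkpoints([grounding, web, mobile], weights=[1.0, 1.0, 1.0])
```

With equal weights, each merged parameter is the mean of the three specialists' values. Unequal weights would bias the unified checkpoint toward one domain, which is one of the choices a merging recipe has to make.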