Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights

June 3, 2025
Authors: Mathieu Andreux, Breno Baldas Skuk, Hamza Benchekroun, Emilien Biré, Antoine Bonnet, Riaz Bordie, Matthias Brunel, Pierre-Louis Cedoz, Antoine Chassang, Mickaël Chen, Alexandra D. Constantinou, Antoine d'Andigné, Hubert de La Jonquière, Aurélien Delfosse, Ludovic Denoyer, Alexis Deprez, Augustin Derupti, Michael Eickenberg, Mathïs Federico, Charles Kantor, Xavier Koegler, Yann Labbé, Matthew C. H. Lee, Erwan Le Jumeau de Kergaradec, Amir Mahla, Avshalom Manevich, Adrien Maret, Charles Masson, Rafaël Maurin, Arturo Mena, Philippe Modard, Axel Moyal, Axel Nguyen Kerbel, Julien Revelle, Mats L. Richter, María Santos, Laurent Sifre, Maxime Theillard, Marc Thibault, Louis Thiry, Léo Tronchon, Nicolas Usunier, Tony Wu
cs.AI

Abstract

We present Surfer-H, a cost-efficient web agent that integrates Vision-Language Models (VLMs) to perform user-defined tasks on the web. We pair it with Holo1, a new open-weight collection of VLMs specialized in web navigation and information extraction. Holo1 was trained on carefully curated data sources, including open-access web content, synthetic examples, and self-produced agentic data. Holo1 tops generalist User Interface (UI) benchmarks as well as our new web UI localization benchmark, WebClick. When powered by Holo1, Surfer-H achieves state-of-the-art performance of 92.2% on WebVoyager, striking a Pareto-optimal balance between accuracy and cost-efficiency. To accelerate research advancement in agentic systems, we are open-sourcing both our WebClick evaluation dataset and the Holo1 model weights.