
HarmonyGuard: Toward Safety and Utility in Web Agents via Adaptive Policy Enhancement and Dual-Objective Optimization

August 6, 2025
Authors: Yurun Chen, Xavier Hu, Yuhan Liu, Keting Yin, Juncheng Li, Zhuosheng Zhang, Shengyu Zhang
cs.AI

Abstract

Large language models enable agents to perform tasks autonomously in open web environments. However, as hidden threats within the web evolve, web agents face the challenge of balancing task performance against emerging risks during long-sequence operations. Although this challenge is critical, current research remains limited to single-objective optimization or single-turn scenarios and lacks the ability to jointly optimize safety and utility in web environments. To address this gap, we propose HarmonyGuard, a multi-agent collaborative framework that leverages policy enhancement and objective optimization to jointly improve utility and safety. HarmonyGuard's multi-agent architecture is characterized by two fundamental capabilities: (1) Adaptive Policy Enhancement: we introduce the Policy Agent within HarmonyGuard, which automatically extracts and maintains structured security policies from unstructured external documents and continuously updates those policies in response to evolving threats. (2) Dual-Objective Optimization: based on the dual objectives of safety and utility, the Utility Agent integrated within HarmonyGuard performs Markovian real-time reasoning to evaluate the objectives and uses metacognitive capabilities to optimize them. Extensive evaluations on multiple benchmarks show that HarmonyGuard improves policy compliance by up to 38% and task completion by up to 20% over existing baselines, while achieving over 90% policy compliance across all tasks. The project is available at https://github.com/YurunChen/HarmonyGuard.