AgentSwing: Adaptive Parallel Context Management Routing for Long-Horizon Web Agents

Abstract

As large language models (LLMs) evolve into autonomous agents for long-horizon information seeking, managing finite context capacity has become a critical bottleneck. Existing context management methods typically commit to a single fixed strategy for the entire trajectory. Such static designs may work well in some states, but they cannot adapt as the usefulness and reliability of the accumulated context change during long-horizon search. To formalize this challenge, we introduce a probabilistic framework that characterizes long-horizon success along two complementary dimensions: search efficiency and terminal precision. Building on this perspective, we propose AgentSwing, a state-aware framework that adaptively routes among parallel context management strategies. At each trigger point, AgentSwing expands multiple context-managed branches in parallel and uses lookahead routing to select the most promising continuation. Experiments across diverse benchmarks and agent backbones show that AgentSwing consistently outperforms strong static context management methods, often matching or exceeding their performance with up to 3× fewer interaction turns, while also raising the ultimate performance ceiling of long-horizon web agents. Beyond the empirical gains, the proposed probabilistic framework provides a principled lens for analyzing and designing future context management strategies for long-horizon agents.
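
The routing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the strategies, the trigger condition, or the scoring rule, so every name below (keep_all, keep_recent, discard_all, should_trigger, route_context, toy_lookahead, toy_score) is a hypothetical placeholder. The sketch only shows the general shape: when a trigger fires, apply several context management strategies in parallel, roll each branch forward a little, and keep the branch that scores best.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Assumption: a context is modeled as a list of text items,
# with the task statement stored as the first item.
Context = List[str]
Strategy = Callable[[Context], Context]


def keep_all(ctx: Context) -> Context:
    """Hypothetical strategy: leave the context untouched."""
    return list(ctx)


def keep_recent(ctx: Context) -> Context:
    """Hypothetical strategy: keep the task statement and the latest item."""
    return ctx[:1] + ctx[1:][-1:]


def discard_all(ctx: Context) -> Context:
    """Hypothetical strategy: restart from the task statement alone."""
    return ctx[:1]


def should_trigger(ctx: Context, max_items: int) -> bool:
    """Assumed trigger rule: fire once the context reaches a size budget."""
    return len(ctx) >= max_items


def route_context(
    ctx: Context,
    strategies: Dict[str, Strategy],
    lookahead: Callable[[Context], Context],
    score: Callable[[Context], float],
) -> Tuple[str, Context]:
    """Expand one branch per strategy in parallel and return the best one."""

    def run_branch(item: Tuple[str, Strategy]) -> Tuple[str, Context, float]:
        name, manage = item
        branch = lookahead(manage(ctx))  # short rollout on the managed context
        return name, branch, score(branch)

    with ThreadPoolExecutor() as pool:
        results = list(pool.map(run_branch, strategies.items()))

    best_name, best_branch, _ = max(results, key=lambda r: r[2])
    return best_name, best_branch


def toy_lookahead(ctx: Context) -> Context:
    """Stand-in for a few agent turns: append one placeholder step."""
    return ctx + ["next"]


def toy_score(ctx: Context) -> float:
    """Stand-in router signal: fraction of items marked as evidence."""
    return sum(item == "evidence" for item in ctx) / len(ctx)
```

In a real agent, the lookahead would run a few actual interaction turns and the score would come from whatever signal the router uses; both are replaced here by toy functions so that the control flow can be run in isolation.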
