Toward Scalable Terminal Task Synthesis via Skill Graphs

Abstract

Terminal agents have shown strong potential for autonomous command-line execution, yet their training remains constrained by the scarcity of high-quality, diverse execution trajectories. Existing approaches mitigate this bottleneck by synthesizing terminal task instances at scale for trajectory sampling. However, they focus primarily on increasing the number of tasks and offer limited control over the diversity of execution trajectories that agents actually experience during training. In this paper, we present SkillSynth, an automated framework for terminal task synthesis built on a scenario-mediated skill graph. SkillSynth first constructs a large-scale skill graph in which scenarios serve as intermediate transition nodes connecting diverse command-line skills. It then samples paths from this graph as abstractions of real-world workflows and uses a multi-agent harness to instantiate them as executable task instances. By grounding task synthesis in graph-sampled workflow paths, SkillSynth explicitly controls the diversity of the minimal execution trajectories required to solve the synthesized tasks. Experiments on Terminal-Bench demonstrate the effectiveness of SkillSynth. Moreover, task instances synthesized by SkillSynth have been adopted to train Hy3 Preview, contributing to its enhanced agentic capabilities in terminal-based settings.
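As a rough illustration of the scenario-mediated skill graph and path sampling described above, the following minimal sketch alternates skill and scenario nodes in a random walk. It is not SkillSynth's implementation: the skill and scenario names, the two adjacency tables, and the `sample_path` helper are all hypothetical, and uniform random sampling is an assumption made only for this example.

```python
import random

# Hypothetical toy graph (names are illustrative, not taken from the paper).
# Skills link to scenarios in which they are used; scenarios link to the
# skills that can plausibly come next in that scenario.
SKILL_TO_SCENARIOS = {
    "find_files": ["triage_logs", "clean_workspace"],
    "grep_text": ["triage_logs", "summarize_errors"],
    "archive_tar": ["ship_backup"],
    "sort_uniq": ["summarize_errors"],
    "rsync_copy": [],
}
SCENARIO_TO_SKILLS = {
    "triage_logs": ["grep_text", "sort_uniq"],
    "clean_workspace": ["archive_tar"],
    "summarize_errors": ["sort_uniq", "archive_tar"],
    "ship_backup": ["rsync_copy"],
}


def sample_path(start, max_skills, rng):
    """Random walk skill -> scenario -> skill that never repeats a skill.

    Returns a list that alternates skills (even indices) and the scenarios
    that mediate each transition (odd indices).
    """
    path = [start]
    visited = {start}
    current = start
    while len(visited) < max_skills:
        # Collect every (scenario, next_skill) hop reachable from the current skill.
        candidates = [
            (scenario, skill)
            for scenario in SKILL_TO_SCENARIOS.get(current, [])
            for skill in SCENARIO_TO_SKILLS.get(scenario, [])
            if skill not in visited
        ]
        if not candidates:
            break  # dead end: no unvisited skill is reachable
        scenario, skill = rng.choice(candidates)
        path += [scenario, skill]
        visited.add(skill)
        current = skill
    return path


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same walk
    print(sample_path("find_files", max_skills=4, rng=rng))
```

In this sketch the skills at even indices of a sampled path stand in for the minimal sequence of skills a task would require, so varying the sampled paths is one simple way to vary the trajectories that tasks demand. A downstream step (here left out) would turn each path into a concrete, executable task.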
