HarnessX: A Composable, Adaptive, and Evolvable Agent Harness Foundry
June 12, 2026
Authors: Tingyang Chen, Shuo Lu, Kang Zhao, Weicheng Meng, Hanlin Teng, Tianhao Li, Chao Li, Xule Liu, Jian Liang, Zhizhong Zhang, Yuan Xie, Heng Qu, Kun Shao, Jian Luan
cs.AI
Abstract
AI agent performance depends critically on the runtime harness, comprising the prompts, tools, memory, and control flow that mediate how a model observes, reasons, and acts. Yet today's harnesses remain largely hand-crafted and static: each new model or task still demands bespoke scaffolding, and the rich traces produced during execution are rarely distilled back into systematic improvement. We introduce HarnessX, a foundry for composable, adaptive, and evolvable agent harnesses. HarnessX assembles typed harness primitives via a substitution algebra, adapts them through AEGIS, a trace-driven multi-agent evolution engine grounded in an operational mirror between symbolic adaptation and reinforcement learning, and closes the harness-model loop by turning trajectories into both harness updates and model training signal. Across five benchmarks (ALFWorld, GAIA, WebShop, tau^3-Bench, and SWE-bench Verified), HarnessX yields an average gain of +14.5% (up to +44.0%), with gains largest where baselines are lowest. These results suggest that agent progress need not come from model scaling alone: composing and evolving runtime interfaces from execution feedback is an actionable and complementary lever. The complete codebase will be open-sourced in a future release.