Experience Breeds Skill: Empowering Generalizable Medical Agent Reasoning via Self-Evolving Skill Memory

Abstract

Medical agent systems are increasingly expected to support interactive clinical decision-making rather than only static question answering. In such settings, effective agents must reuse prior experience across evolving cases, yet existing memory mechanisms often retain raw historical traces that are redundant, noisy, and difficult to govern. More importantly, they rarely distinguish which memories are truly useful for future reasoning. This limits their ability to accumulate compact and reliable experience for long-horizon clinical reasoning. To close this gap, we propose SkeMex, a post-deployment self-evolution framework that improves medical agents through a skill-based memory without updating model weights. SkeMex distills informative interaction trajectories into structured skills that encode reusable procedural knowledge, and organizes them into a multi-branch repository spanning general, task-specific, and action-level experience. To determine which memories should be reused and retained, SkeMex estimates context-dependent utility from environment feedback and uses it to guide value-aware retrieval and repository governance. A closed-loop "Read-Write-Assess-Govern" lifecycle further supports continual evolution by writing new skills, updating utilities, promoting useful memories, and removing harmful entries. Experiments across diverse clinical tasks show that SkeMex consistently outperforms representative memory-based agents in both offline and online settings. It also generalizes across model backbones and supports transferable skill memory. All data and code will be released publicly.
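To make the lifecycle concrete, the following is a minimal sketch of a read-write-assess-govern loop over a three-branch skill store. It is an illustration only, not the SkeMex implementation: the class and field names (`Skill`, `SkillRepository`, `alpha`, `prune_below`, `min_uses`), the exponential-moving-average utility update, the word-overlap relevance score, and the pruning rule are all assumptions introduced here. Utility is simplified to a single scalar per skill rather than a context-dependent estimate, and promotion between branches is omitted.

```python
from dataclasses import dataclass, field

# The three branches named in the abstract; the labels are illustrative.
BRANCHES = ("general", "task", "action")


@dataclass
class Skill:
    text: str             # distilled procedural knowledge (free text here)
    branch: str           # one of BRANCHES
    utility: float = 0.0  # running utility estimate (assumed scalar)
    uses: int = 0         # number of times feedback was recorded


@dataclass
class SkillRepository:
    alpha: float = 0.5         # assumed step size for the utility update
    prune_below: float = -0.3  # assumed utility threshold for removal
    min_uses: int = 2          # assumed evidence required before removal
    skills: list = field(default_factory=list)

    def write(self, text, branch):
        """Write: store a newly distilled skill in the given branch."""
        if branch not in BRANCHES:
            raise ValueError(f"unknown branch: {branch}")
        skill = Skill(text, branch)
        self.skills.append(skill)
        return skill

    def read(self, query, k=2):
        """Read: value-aware retrieval, ranking by relevance plus utility."""
        q = set(query.lower().split())

        def score(skill):
            words = set(skill.text.lower().split())
            # Jaccard word overlap stands in for a real relevance model.
            relevance = len(q & words) / max(len(q | words), 1)
            return relevance + skill.utility

        return sorted(self.skills, key=score, reverse=True)[:k]

    def assess(self, skill, reward):
        """Assess: move the utility estimate toward the observed reward."""
        skill.uses += 1
        skill.utility += self.alpha * (reward - skill.utility)

    def govern(self):
        """Govern: drop skills with enough evidence and low utility."""
        kept = [
            s for s in self.skills
            if not (s.uses >= self.min_uses and s.utility < self.prune_below)
        ]
        removed = len(self.skills) - len(kept)
        self.skills = kept
        return removed
```

In this sketch an agent would call `read` before acting, `assess` once environment feedback arrives for each retrieved skill, `write` when a trajectory yields a new skill, and `govern` periodically. Adding utility to the relevance score is one simple way to let feedback influence retrieval; a weighted combination or a context-conditioned utility would be natural alternatives.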