Experience Builds Proficiency: Generalizable Medical Agent Reasoning via Self-Evolving Skill Memory

Abstract

Medical agent systems are increasingly expected to support interactive clinical decision making rather than only static question answering. In such settings, effective agents must reuse prior experience across evolving cases, yet existing memory mechanisms often retain raw historical traces that are redundant, noisy, and difficult to govern. More importantly, they rarely distinguish which memories are actually useful for future reasoning, which limits their ability to accumulate compact and reliable experience for long-horizon clinical reasoning. To close this gap, we propose SkeMex, a post-deployment self-evolution framework that improves medical agents through skill-based memory without updating model weights. SkeMex distills informative interaction trajectories into structured skills that encode reusable procedural knowledge, and organizes them into a multi-branch repository spanning general, task-specific, and action-level experience. To determine which memories should be reused and retained, SkeMex estimates context-dependent utility from environment feedback and uses it to guide value-aware retrieval and repository governance. A closed-loop "Read-Write-Assess-Govern" lifecycle further supports continual evolution by writing new skills, updating utilities, promoting useful memories, and removing harmful entries. Experiments across diverse clinical tasks show that SkeMex consistently outperforms representative memory-based agents in both offline and online settings. It also generalizes across model backbones and supports transferable skill memory. All data and code will be released publicly.
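The "Read-Write-Assess-Govern" lifecycle described above can be illustrated with a minimal Python sketch. It is not the SkeMex implementation: the class and method names, the keyword-overlap retrieval key, the moving-average utility update, and the values of `alpha`, `weight`, and `prune_below` are all assumptions made for illustration. In particular, the sketch keeps a single utility per skill, whereas the abstract describes utility as context-dependent.

```python
from dataclasses import dataclass, field

# Assumed names for the three repository branches mentioned in the abstract.
BRANCHES = ("general", "task", "action")


@dataclass
class Skill:
    text: str              # distilled procedural knowledge
    branch: str            # one of BRANCHES
    keywords: frozenset    # crude stand-in for a learned retrieval key
    utility: float = 0.0   # running usefulness estimate (hypothetical)
    uses: int = 0          # number of times feedback was recorded


@dataclass
class SkillMemory:
    alpha: float = 0.5         # hypothetical step size for utility updates
    prune_below: float = -0.5  # hypothetical removal threshold
    skills: list = field(default_factory=list)

    def write(self, text, branch, keywords):
        """Write: store a newly distilled skill in one branch."""
        if branch not in BRANCHES:
            raise ValueError(f"unknown branch: {branch}")
        skill = Skill(text, branch, frozenset(keywords))
        self.skills.append(skill)
        return skill

    def read(self, query_keywords, k=2, weight=0.5):
        """Read: rank skills by a blend of relevance and estimated utility."""
        query = set(query_keywords)

        def score(skill):
            union = query | skill.keywords
            overlap = len(query & skill.keywords) / len(union) if union else 0.0
            return (1 - weight) * overlap + weight * skill.utility

        return sorted(self.skills, key=score, reverse=True)[:k]

    def assess(self, skill, reward):
        """Assess: move the utility estimate toward the observed feedback."""
        skill.utility += self.alpha * (reward - skill.utility)
        skill.uses += 1

    def govern(self):
        """Govern: drop skills whose feedback shows they are harmful."""
        self.skills = [
            s for s in self.skills
            if not (s.uses > 0 and s.utility < self.prune_below)
        ]


# Illustrative usage with made-up skills and feedback values.
memory = SkillMemory()
good = memory.write("Check renal function before dosing", "task", {"dosing", "renal"})
bad = memory.write("Skip allergy history", "action", {"dosing", "allergy"})
memory.assess(good, 1.0)
memory.assess(bad, -1.0)
memory.assess(bad, -1.0)
top = memory.read({"dosing", "renal"}, k=1)
memory.govern()
```

Blending relevance with utility in `read` is one simple way to make retrieval value-aware; a real system would likely replace the keyword overlap with embedding similarity and condition the utility estimate on the query context.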