CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment

Abstract

Large language models (LLMs) have become a central foundation of modern artificial intelligence, yet their lifecycle remains constrained by a rigid separation between training and deployment: once a model is deployed, learning effectively ceases. This limitation contrasts with natural intelligence, which adapts continually through interaction with its environment. In this paper, we formalise deployment-time learning (DTL) as the third stage in the LLM lifecycle, one in which LLM agents improve from experience during deployment without modifying model parameters. We present CASCADE (CASe-based Continual Adaptation during DEployment), a general and principled framework that equips LLM agents with an explicit, evolving episodic memory. CASCADE formulates experience reuse as a contextual bandit problem, which enables principled exploration-exploitation trade-offs and establishes no-regret guarantees over long-term interactions. This design allows agents to accumulate, select, and refine task-relevant cases, transforming past experience into actionable knowledge. Across 16 diverse tasks spanning medical diagnosis, legal analysis, code generation, web search, tool use, and embodied interaction, CASCADE improves macro-averaged success rate by 20.9% over zero-shot prompting and consistently outperforms gradient-based and memory-based baselines. By reframing deployment as an adaptive learning process, this work establishes a foundation for continually improving AI systems.
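To make the contextual bandit framing of experience reuse concrete, the following is a minimal sketch and not the CASCADE algorithm itself, whose details are not given in this abstract. It assumes that each stored case and each incoming query has a fixed-size embedding, that the feature vector for a (query, case) pair is their elementwise product, and that a scalar reward such as task success is observed after each episode. The selector is a standard LinUCB-style rule with a shared linear model; the names `Case`, `CaseBandit`, and `alpha` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Case:
    """A stored episode: a task embedding plus the text reused in the prompt."""
    embedding: list[float]
    content: str


def dot(u: list[float], v: list[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


class CaseBandit:
    """LinUCB-style case selector with a shared linear reward model (illustrative)."""

    def __init__(self, dim: int, alpha: float = 1.0) -> None:
        self.dim = dim
        self.alpha = alpha  # exploration strength (assumed hyperparameter)
        self.cases: list[Case] = []
        # Inverse of the regularised design matrix, initialised to the identity.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        # Running sum of reward-weighted feature vectors.
        self.b = [0.0] * dim

    def features(self, query: list[float], case: Case) -> list[float]:
        # Assumed feature map: elementwise product of query and case embeddings.
        return [q * c for q, c in zip(query, case.embedding)]

    def _a_inv_times(self, x: list[float]) -> list[float]:
        return [dot(row, x) for row in self.a_inv]

    def score(self, query: list[float], case: Case) -> float:
        x = self.features(query, case)
        theta = self._a_inv_times(self.b)  # ridge-regression estimate
        bonus = math.sqrt(max(dot(x, self._a_inv_times(x)), 0.0))
        return dot(theta, x) + self.alpha * bonus  # estimated reward + exploration bonus

    def select(self, query: list[float]) -> int:
        """Return the index of the case with the highest upper confidence bound."""
        return max(range(len(self.cases)), key=lambda i: self.score(query, self.cases[i]))

    def update(self, query: list[float], index: int, reward: float) -> None:
        """Fold the observed reward for the chosen case back into the model."""
        x = self.features(query, self.cases[index])
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + dot(x, a_inv_x)
        # Sherman-Morrison rank-one update of the inverse design matrix.
        for i in range(self.dim):
            for j in range(self.dim):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]
```

In this sketch, a deployment loop would call `select` for each incoming task, place the chosen case's `content` in the prompt, and call `update` with the observed outcome. The exploration bonus shrinks for feature directions that have been tried often, which is the exploration-exploitation trade-off the abstract refers to.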
