Adapt before Continual Learning
June 4, 2025
Authors: Aojun Lu, Tao Feng, Hangjie Yuan, Chunhui Ding, Yanan Sun
cs.AI
Abstract
Continual Learning (CL) seeks to enable neural networks to incrementally
acquire new knowledge (plasticity) while retaining existing knowledge
(stability). While pre-trained models (PTMs) have become pivotal in CL,
prevailing approaches freeze the PTM backbone to preserve stability, limiting
their plasticity, particularly when encountering significant domain gaps in
incremental tasks. Conversely, sequentially finetuning the entire PTM risks
catastrophic forgetting of generalizable knowledge, exposing a critical
stability-plasticity trade-off. To address this challenge, we propose Adapting
PTMs before the core CL process (ACL), a novel framework that refines the PTM
backbone through a plug-and-play adaptation phase before learning each new task
with existing CL approaches (e.g., prompt tuning). ACL enhances plasticity by
aligning embeddings with their original class prototypes while distancing them
from others, a strategy shown both theoretically and empirically to balance
stability and plasticity. Extensive experiments demonstrate that ACL significantly improves
CL performance across benchmarks and integrated methods, offering a versatile
solution for PTM-based CL.
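
To make the adaptation objective concrete, below is a minimal sketch (not the authors' released code) of a prototype-alignment loss consistent with the abstract's description: each embedding is pulled toward its own class prototype and pushed away from the prototypes of other classes. The function name, the use of class-mean prototypes, and the temperature parameter are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def prototype_alignment_loss(embeddings, labels, prototypes, temperature=0.1):
    """Cross-entropy over cosine similarities to class prototypes.

    embeddings: (B, D) features from the PTM backbone being adapted.
    labels:     (B,) ground-truth class indices for the current task.
    prototypes: (C, D) one prototype per class (e.g., the class-mean embedding).
    """
    emb = F.normalize(embeddings, dim=-1)
    proto = F.normalize(prototypes, dim=-1)
    logits = emb @ proto.t() / temperature  # (B, C) similarity scores
    # Maximizing the true-class probability aligns each embedding with its own
    # prototype while distancing it from the other prototypes.
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    torch.manual_seed(0)
    feats = torch.randn(8, 32)              # stand-in for backbone embeddings
    labels = torch.randint(0, 4, (8,))
    # One common choice of prototype: the per-class mean embedding.
    protos = torch.stack([
        feats[labels == c].mean(0) if (labels == c).any() else torch.zeros(32)
        for c in range(4)
    ])
    print(prototype_alignment_loss(feats, labels, protos).item())
```

In the plug-and-play setting described above, such a loss would be minimized briefly on the new task's data to refine the backbone before handing control to the existing CL method (e.g., prompt tuning) for that task.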