Adapt before Continual Learning
June 4, 2025
Authors: Aojun Lu, Tao Feng, Hangjie Yuan, Chunhui Ding, Yanan Sun
cs.AI
Abstract
Continual Learning (CL) seeks to enable neural networks to incrementally
acquire new knowledge (plasticity) while retaining existing knowledge
(stability). While pre-trained models (PTMs) have become pivotal in CL,
prevailing approaches freeze the PTM backbone to preserve stability, limiting
their plasticity, particularly when encountering significant domain gaps in
incremental tasks. Conversely, sequentially finetuning the entire PTM risks
catastrophic forgetting of generalizable knowledge, exposing a critical
stability-plasticity trade-off. To address this challenge, we propose Adapting
PTMs before the core CL process (ACL), a novel framework that refines the PTM
backbone through a plug-and-play adaptation phase before learning each new task
with existing CL approaches (e.g., prompt tuning). ACL enhances plasticity by
aligning embeddings with their original class prototypes while distancing them
from others, a strategy shown both theoretically and empirically to balance
stability and plasticity. Extensive experiments demonstrate that ACL
significantly improves
CL performance across benchmarks and integrated methods, offering a versatile
solution for PTM-based CL.
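
The adaptation idea described above, pulling each embedding toward its own class prototype while pushing it away from the other prototypes, can be sketched as a simple training objective. The snippet below is a minimal illustrative sketch under assumed details, not the paper's exact formulation: the function name, cosine normalization, and temperature are assumptions introduced here for illustration.

```python
# Illustrative sketch (assumed details, not the authors' exact objective):
# pull each embedding toward its own class prototype and push it away from
# the other prototypes via a softmax over cosine similarities.
import torch
import torch.nn.functional as F

def prototype_alignment_loss(embeddings, prototypes, labels, temperature=0.1):
    """embeddings: (B, D) features from the PTM backbone.
    prototypes: (C, D) one prototype per class (e.g., class-mean embeddings).
    labels:     (B,) integer class indices.
    """
    z = F.normalize(embeddings, dim=-1)   # unit-normalize features
    p = F.normalize(prototypes, dim=-1)   # unit-normalize prototypes
    logits = z @ p.t() / temperature      # (B, C) scaled cosine similarities
    # Cross-entropy pulls each embedding toward its own prototype (numerator)
    # and away from all other prototypes (denominator).
    return F.cross_entropy(logits, labels)

if __name__ == "__main__":
    # Toy usage: adapt the backbone with this loss before running the chosen
    # CL method (e.g., prompt tuning) on each new task.
    B, C, D = 8, 10, 64
    emb = torch.randn(B, D, requires_grad=True)
    proto = torch.randn(C, D)
    y = torch.randint(0, C, (B,))
    loss = prototype_alignment_loss(emb, proto, y)
    loss.backward()
    print(float(loss))
```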