SPARC: Subspace-Aware Prompt Adaptation for Robust Continual Learning in LLMs
February 5, 2025
Authors: Dinithi Jayasuriya, Sina Tayebati, Davide Ettori, Ranganath Krishnan, Amit Ranjan Trivedi
cs.AI
Abstract
We propose SPARC, a lightweight continual learning framework for large
language models (LLMs) that enables efficient task adaptation through prompt
tuning in a lower-dimensional space. By leveraging principal component analysis
(PCA), we identify a compact subspace of the training data. Optimizing prompts
in this lower-dimensional space enhances training efficiency, as it focuses
updates on the most relevant features while reducing computational overhead.
Furthermore, since the model's internal structure remains unaltered, the
extensive knowledge gained from pretraining is fully preserved, ensuring that
previously learned information is not compromised during adaptation. Our method
achieves high knowledge retention in both task-incremental and
domain-incremental continual learning setups while fine-tuning only 0.04% of
the model's parameters. Additionally, by integrating LoRA, we enhance
adaptability to computational constraints, allowing for a tradeoff between
accuracy and training cost. Experiments on the SuperGLUE benchmark demonstrate
that our PCA-based prompt tuning combined with LoRA maintains full knowledge
retention while improving accuracy, utilizing only 1% of the model's
parameters. These results establish our approach as a scalable and
resource-efficient solution for continual learning in LLMs.
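The abstract describes the method only at a high level. The following is a minimal sketch, not the authors' implementation, of what prompt tuning restricted to a PCA subspace of the training-data embeddings could look like. The names pca_basis, SubspacePrompt, embed_dim, k, and prompt_len, as well as the use of torch.pca_lowrank, are illustrative assumptions; the frozen LLM and the optional LoRA integration mentioned in the abstract are omitted.

# Minimal sketch (assumed design, not the authors' released code) of
# prompt tuning in a PCA subspace of the input embeddings.
import torch


def pca_basis(embeddings: torch.Tensor, k: int) -> torch.Tensor:
    # torch.pca_lowrank centers the data by default and returns (U, S, V);
    # the columns of V are principal directions in embedding space.
    _, _, v = torch.pca_lowrank(embeddings, q=k)
    return v  # shape: (embed_dim, k)


class SubspacePrompt(torch.nn.Module):
    """Soft prompt parameterized by k coordinates per prompt token."""

    def __init__(self, basis: torch.Tensor, prompt_len: int):
        super().__init__()
        # Frozen PCA directions, shape (embed_dim, k); not a trainable parameter.
        self.register_buffer("basis", basis)
        # Trainable low-dimensional coordinates, shape (prompt_len, k).
        self.coords = torch.nn.Parameter(torch.zeros(prompt_len, basis.shape[1]))

    def forward(self) -> torch.Tensor:
        # Map the low-dimensional coordinates back to the full embedding space;
        # only self.coords is updated, the LLM itself stays frozen.
        return self.coords @ self.basis.T  # shape: (prompt_len, embed_dim)


# Toy usage with random stand-ins for real training-data embeddings.
embed_dim, k, prompt_len = 768, 16, 20
train_embeddings = torch.randn(10_000, embed_dim)
prompt = SubspacePrompt(pca_basis(train_embeddings, k), prompt_len)
print(prompt().shape)  # torch.Size([20, 768]); prepended to the model's inputs

Under these assumptions, only the prompt_len x k coordinate matrix (here 20 x 16 = 320 values) is trained per task, which illustrates how restricting updates to a compact subspace keeps the trainable parameter count to a tiny fraction of the model.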