Memory Bank Compression for Continual Adaptation of Large Language Models

Abstract

Large Language Models (LLMs) have become a mainstay of many everyday applications. However, as data evolve, their knowledge quickly becomes outdated. Continual learning aims to update LLMs with new information without erasing previously acquired knowledge. Although methods such as full fine-tuning can incorporate new data, they are computationally expensive and prone to catastrophic forgetting, in which prior knowledge is overwritten. Memory-augmented approaches address this by equipping LLMs with a memory bank, an external memory module that stores information for future use. However, these methods face a critical limitation: in real-world scenarios, the memory bank grows without bound as large-scale data streams arrive. In this paper, we propose MBC, a model that compresses the memory bank through a codebook optimization strategy during online adaptation learning. To ensure stable learning, we also introduce an online resetting mechanism that prevents codebook collapse. In addition, we employ Key-Value Low-Rank Adaptation in the attention layers of the LLM, enabling efficient use of the compressed memory representations. Experiments on benchmark question-answering datasets demonstrate that MBC reduces the memory bank size to 0.3% of that of the most competitive baseline, while maintaining high retention accuracy during online adaptation learning. Our code is publicly available at https://github.com/Thomkat/MBC.
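To make the codebook idea in the abstract more concrete, the following is a minimal sketch of codebook-based memory compression with a simple reset rule for unused codes. It is not the authors' implementation (that is available at the repository linked above). The names `CodebookMemory` and `nearest_code`, the parameters `lr`, `decay`, and `reset_threshold`, and the nearest-neighbour update and usage-based reset rules are all assumptions chosen for illustration.

```python
import random


def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


class CodebookMemory:
    """Hypothetical memory bank that stores one code index per entry
    instead of the full vector (illustrative only)."""

    def __init__(self, num_codes, dim, lr=0.1, decay=0.9,
                 reset_threshold=0.05, seed=0):
        rng = random.Random(seed)
        # Randomly initialised codebook: num_codes vectors of size dim.
        self.codebook = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                         for _ in range(num_codes)]
        self.usage = [1.0] * num_codes  # running usage estimate per code
        self.lr = lr
        self.decay = decay
        self.reset_threshold = reset_threshold
        self.indices = []  # the compressed memory bank

    def write(self, vec):
        """Quantise vec, update the codebook online, and store its index."""
        idx = nearest_code(vec, self.codebook)
        # Move the selected code toward the input (online k-means style).
        code = self.codebook[idx]
        for d in range(len(code)):
            code[d] += self.lr * (vec[d] - code[d])
        # Exponential moving average of how often each code is selected.
        for k in range(len(self.usage)):
            hit = 1.0 if k == idx else 0.0
            self.usage[k] = self.decay * self.usage[k] + (1.0 - self.decay) * hit
        # Assumed reset rule: re-seed rarely used codes from the current
        # input so the codebook does not collapse onto a few entries.
        for k in range(len(self.usage)):
            if self.usage[k] < self.reset_threshold:
                self.codebook[k] = list(vec)
                self.usage[k] = 1.0
        self.indices.append(idx)
        return idx

    def read(self, position):
        """Reconstruct the stored entry at position from the codebook."""
        return self.codebook[self.indices[position]]
```

In this sketch, each call to `write` appends a single integer to `indices`, so storage grows by one index per entry rather than one full vector. Because the codebook keeps changing during online updates, `read` returns the current value of the selected code, which only approximates the vector that was originally written.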
