「LoRAはどのように記憶するのか？ LLMファインチューニングのためのパラメトリック記憶則」

要旨

大規模言語モデル（LLM）は、動的な実世界環境において効果的であり続けるために、継続的に学習し知識を更新する必要がある。LoRA（Low-Rank Adaptation）はそのようなメモリ更新に広く用いられているが、既存の研究は主に定性的な下流評価に依存しており、厳密なパラメトリックメモリの定量的な容量限界やその基盤となるダイナミクスはほとんど解明されていない。このギャップを埋めるべく、我々は潜在空間内でLoRAを制御されたメモリ容量プローブとして活用し、厳密なパラメトリックメモリを体系的に定量化する。我々は、損失低減ΔLと有効パラメータ数、シーケンス長を結びつける頑健な冪乗則である「パラメトリックメモリの法則」を導入する。トークンレベルでの詳細な分析は、決定論的な相転移を明らかにし、貪欲デコードにおける逐語的記憶には、予測確率p > 0.5が十分条件であることを示す。これらの知見に基づき、我々はMemFTを導入する。これは、閾値に基づく最適化戦略であり、トレーニング予算を閾値未満のトークンに動的に再分配する。実証評価により、MemFTがメモリの忠実性と効率性を向上させることが示される。コードはhttps://github.com/zjunlp/ParametricMemoryLaw で公開される予定である。

English

Large Language Models (LLMs) must continuously learn and update knowledge to remain effective in dynamic real-world environments. While Low-Rank Adaptation (LoRA) is widely used for such memory updates, existing studies mainly rely on qualitative downstream evaluations, leaving the quantitative capacity limits and underlying dynamics of exact parametric memory largely unexplored. To bridge this gap, we employ LoRA as a controlled memory capacity probe within the latent space to systematically quantify exact parametric memory. We introduce the Parametric Memory Law, a robust power law linking loss reduction Delta L to effective parameters and sequence length. At the token level, fine-grained analysis reveals a deterministic phase transition, demonstrating that a prediction probability of p > 0.5 constitutes a sufficient condition for verbatim recall under greedy decoding. Driven by these insights, we introduce MemFT, a threshold-guided optimization strategy that dynamically redistributes the training budget toward sub-threshold tokens. Empirical evaluations demonstrate that MemFT can enhance memory fidelity and efficiency. Code will be released at https://github.com/zjunlp/ParametricMemoryLaw.