REPAIR: Robust Editing via Progressive Adaptive Intervention and Reintegration
October 2, 2025
Authors: Yisu Wang, Ming Wang, Haoyuan Song, Wenjie Huang, Chaozheng Wang, Yi Xie, Xuming Ran
cs.AI
Abstract
Post-training for large language models (LLMs) is constrained by the high
cost of acquiring new knowledge or correcting errors and by the unintended side
effects that frequently arise from retraining. To address these issues, we
introduce REPAIR (Robust Editing via Progressive Adaptive Intervention and
Reintegration), a lifelong editing framework designed to support precise and
low-cost model updates while preserving non-target knowledge. REPAIR mitigates
the instability and conflicts of large-scale sequential edits through a
closed-loop feedback mechanism coupled with dynamic memory management.
Furthermore, by incorporating frequent knowledge fusion and enforcing strong
locality guards, REPAIR effectively addresses the shortcomings of traditional
distribution-agnostic approaches that often overlook unintended ripple effects.
Our experiments demonstrate that REPAIR boosts editing accuracy by 10%-30%
across multiple model families and significantly reduces knowledge forgetting.
This work introduces a robust framework for developing reliable, scalable, and
continually evolving LLMs.
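The abstract describes a closed-loop editing cycle with locality guards and periodic knowledge fusion. A minimal sketch of that control flow is below; it is an illustrative assumption, not the paper's actual algorithm, and every name (`EditMemory`, `repair_loop`, `probes`, `fuse_every`) is hypothetical, with a plain dictionary standing in for model weights.

```python
class EditMemory:
    """Dynamic memory of pending edits, consolidated ("fused") periodically."""

    def __init__(self, fuse_every=4):
        self.pending = []            # edits not yet merged into the base
        self.fuse_every = fuse_every

    def add(self, edit):
        self.pending.append(edit)
        # Signal when enough edits have accumulated to trigger fusion.
        return len(self.pending) >= self.fuse_every


def repair_loop(base_knowledge, requests, probes, fuse_every=4):
    """Apply edits sequentially with a closed-loop check and locality guard.

    base_knowledge: dict standing in for the model's stored facts.
    requests: list of {"subject": ..., "target": ...} edit requests.
    probes: dict of unrelated facts that must remain unchanged (locality).
    """
    model = dict(base_knowledge)
    memory = EditMemory(fuse_every)
    applied = 0
    for req in requests:
        candidate = dict(model)
        candidate[req["subject"]] = req["target"]     # tentative edit
        # Closed-loop feedback: accept only if the edit took hold AND
        # no locality probe was disturbed.
        edit_holds = candidate.get(req["subject"]) == req["target"]
        locality_ok = all(candidate.get(k) == v for k, v in probes.items())
        if edit_holds and locality_ok:
            model = candidate
            applied += 1
            if memory.add(req):
                memory.pending.clear()   # fusion step: consolidate and reset
    return model, applied


# Usage: one edit succeeds, one is rejected by the locality guard.
base = {"capital_de": "Bonn", "capital_fr": "Paris"}
requests = [
    {"subject": "capital_de", "target": "Berlin"},   # legitimate correction
    {"subject": "capital_fr", "target": "Lyon"},     # would break a probe
]
probes = {"capital_fr": "Paris"}
model, applied = repair_loop(base, requests, probes)
print(model["capital_de"], applied)   # → Berlin 1
```

The dictionary stand-in only illustrates the control flow the abstract names: tentative intervention, feedback-gated acceptance, locality protection, and periodic reintegration of accumulated edits.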