Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3
May 1, 2024
Authors: Junsang Yoon, Akshat Gupta, Gopala Anumanchipalli
cs.AI
Abstract
This study presents a targeted model editing analysis focused on the latest
large language model, Llama-3. We explore the efficacy of popular model editing
techniques - ROME, MEMIT, and EMMET - which are designed for precise layer
interventions. We identify the most effective layers for targeted edits through
an evaluation that encompasses up to 4096 edits across three distinct
strategies: sequential editing, batch editing, and a hybrid approach we call
sequential-batch editing. Our findings indicate that increasing edit batch
sizes may degrade model performance more significantly than applying smaller
edit batches sequentially for an equal number of edits. With this, we argue
that sequential model editing is an important component for scaling model
editing methods, and that future research should focus on methods that combine
both batched and sequential editing. This observation suggests a potential
limitation of current model editing methods, which push towards bigger edit
batch sizes, and we hope it paves the way for future investigations into
optimizing batch sizes and model editing performance.
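The three evaluation strategies can be made concrete with a short sketch. The following is a minimal Python illustration, not the paper's implementation: `apply_edits` is a hypothetical placeholder for a batched editing method such as MEMIT or EMMET, and the batch size of 64 is only an example.

```python
# Minimal sketch (not the paper's code) of the three editing strategies.
# `apply_edits` is a hypothetical stand-in for a batched editing method
# such as MEMIT or EMMET, which updates model weights for a set of facts.

def apply_edits(model, edits):
    """Placeholder: apply a batch of (subject, relation, new_object)
    edits in a single weight-update intervention."""
    ...  # e.g., a MEMIT/EMMET-style closed-form update to chosen layers
    return model

def batch_editing(model, edits):
    # Strategy 1: one large intervention covering all edits at once.
    return apply_edits(model, edits)

def sequential_editing(model, edits):
    # Strategy 2: one edit at a time; each edit is applied to the
    # already-edited model from the previous step.
    for edit in edits:
        model = apply_edits(model, [edit])
    return model

def sequential_batch_editing(model, edits, batch_size=64):
    # Strategy 3 (hybrid): successive fixed-size batches applied
    # sequentially, e.g. 4096 edits as 64 batches of 64.
    for i in range(0, len(edits), batch_size):
        model = apply_edits(model, edits[i:i + batch_size])
    return model
```

The paper's central comparison is between `batch_editing` with a large batch and `sequential_batch_editing` with smaller batches covering the same total number of edits; the reported finding is that the latter degrades model performance less.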