Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3

Abstract

This study presents a targeted model editing analysis focused on the latest large language model, Llama-3. We explore the efficacy of popular model editing techniques (ROME, MEMIT, and EMMET), which are designed for precise layer interventions. We identify the most effective layers for targeted edits through an evaluation that encompasses up to 4096 edits across three distinct strategies: sequential editing, batch editing, and a hybrid approach we call sequential-batch editing. Our findings indicate that, for an equal number of edits, increasing the edit batch size may degrade model performance more significantly than applying smaller edit batches sequentially. With this, we argue that sequential model editing is an important component for scaling model editing methods, and that future research should focus on methods that combine batched and sequential editing. This observation suggests a potential limitation in current model editing methods, which push towards bigger edit batch sizes, and we hope it paves the way for future investigations into optimizing batch sizes and model editing performance.
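The three strategies differ only in how a fixed set of edits is split into model updates. The sketch below illustrates that scheduling and nothing more; it does not implement ROME, MEMIT, or EMMET. The names `schedule_edits`, `run_schedule`, and `apply_batch` are hypothetical, and the intermediate batch size of 64 is an arbitrary value chosen for illustration, not one reported by the study.

```python
from typing import Callable, List, Sequence, TypeVar

Edit = TypeVar("Edit")
Model = TypeVar("Model")


def schedule_edits(edits: Sequence[Edit], batch_size: int) -> List[List[Edit]]:
    """Split edits into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(edits[i:i + batch_size]) for i in range(0, len(edits), batch_size)]


def run_schedule(
    model: Model,
    edits: Sequence[Edit],
    batch_size: int,
    apply_batch: Callable[[Model, List[Edit]], Model],
) -> Model:
    """Apply batches one after another, each on top of the already-edited model.

    apply_batch is a hypothetical stand-in for one update of an editing method.
    """
    for batch in schedule_edits(edits, batch_size):
        model = apply_batch(model, batch)
    return model


if __name__ == "__main__":
    total = 4096  # largest number of edits mentioned in the abstract
    edits = list(range(total))  # placeholder edit ids, not real facts
    strategies = [
        ("sequential editing", 1),        # one edit per update
        ("sequential-batch editing", 64),  # assumed intermediate batch size
        ("batch editing", total),          # all edits in a single update
    ]
    for name, batch_size in strategies:
        updates = schedule_edits(edits, batch_size)
        print(f"{name}: {len(updates)} updates of up to {batch_size} edits")
```

Under this framing, sequential editing and batch editing are the two extremes of one parameter (batch size 1 and batch size equal to the total number of edits), and sequential-batch editing covers everything in between.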
