MUSCLE: A Model Update Strategy for Compatible LLM Evolution

Abstract

Large Language Models (LLMs) are updated frequently, through changes to their data or architecture, to improve performance. When updating models, developers often focus on raising overall performance metrics and pay less attention to compatibility with previous model versions. Users, however, build a mental model of the functionality and capabilities of the particular machine learning model they interact with, and they must adapt it with every update, a draining task that can lead to dissatisfaction. In practice, fine-tuned downstream task adapters rely on pretrained LLM base models. When these base models are updated, the user-facing downstream task models exhibit instance regression, or negative flips: instances that were previously predicted correctly are now predicted incorrectly. This happens even when the downstream task training procedure remains identical. Our work aims to provide seamless model updates to users in two ways. First, we provide evaluation metrics for a notion of compatibility with prior model versions, designed for generative tasks but also applicable to discriminative tasks. We observe regression and inconsistencies between model versions across a diverse set of tasks and model updates. Second, we propose a training strategy that minimizes the number of inconsistencies in model updates; it involves training a compatibility model that can enhance task fine-tuned language models. We reduce negative flips by up to 40% from Llama 1 to Llama 2.
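
To make the notion of a negative flip concrete, the following is a minimal sketch of a negative flip rate for the discriminative case, assuming per-instance correctness flags are available for both model versions. The function name `negative_flip_rate` and the example data are hypothetical and not taken from the paper; the paper's metrics for generative tasks may be defined differently.

```python
from typing import Sequence


def negative_flip_rate(
    old_correct: Sequence[bool], new_correct: Sequence[bool]
) -> float:
    """Fraction of instances the old model got right and the new model gets wrong."""
    if len(old_correct) != len(new_correct):
        raise ValueError("both models must be scored on the same instances")
    if not old_correct:
        raise ValueError("need at least one instance")
    # A negative flip is an instance that was correct before and is incorrect now.
    flips = sum(o and not n for o, n in zip(old_correct, new_correct))
    return flips / len(old_correct)


# Hypothetical correctness flags for five evaluation instances (illustration only).
old = [True, True, False, True, False]
new = [True, False, True, True, True]
nfr = negative_flip_rate(old, new)
```

In this made-up example the new model is more accurate overall (4 of 5 correct versus 3 of 5), yet one instance that used to be correct is now wrong. An aggregate accuracy score hides exactly this kind of regression.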
