Exploring Model Kinship for Merging Large Language Models

Abstract

Model merging has become one of the key technologies for enhancing the capabilities and efficiency of Large Language Models (LLMs). However, our understanding of the expected performance gains and of the principles that apply when merging any two models remains limited. In this work, we introduce model kinship, the degree of similarity or relatedness between LLMs, analogous to biological evolution. Through comprehensive empirical analysis, we find a relationship between model kinship and the performance gains after model merging, which can help guide the selection of candidate models. Inspired by this, we propose a new model merging strategy, Top-k Greedy Merging with Model Kinship, which can yield better performance on benchmark datasets. Specifically, we find that using model kinship as a criterion helps us perform model merging continually and alleviates the degradation (local optima) in model evolution, with model kinship serving as a guide to escape these traps. Code is available at https://github.com/zjunlp/ModelKinship.
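To make the notion of model kinship concrete, the following is a minimal sketch and not the implementation from the repository above. It assumes, purely for illustration, that kinship is measured as the cosine similarity between two models' weight deltas relative to a shared base model, with weights flattened into plain lists of floats. The function names `cosine_kinship` and `rank_by_kinship` are hypothetical, and the abstract does not specify the metric or how the ranking is used.

```python
import math


def cosine_kinship(weights_a, weights_b, base_weights):
    """Cosine similarity between two models' weight deltas from a shared base.

    Assumption: kinship is computed on (fine-tuned - base) deltas; the
    abstract does not state the exact metric.
    """
    delta_a = [w - b for w, b in zip(weights_a, base_weights)]
    delta_b = [w - b for w, b in zip(weights_b, base_weights)]
    dot = sum(x * y for x, y in zip(delta_a, delta_b))
    norm_a = math.sqrt(sum(x * x for x in delta_a))
    norm_b = math.sqrt(sum(x * x for x in delta_b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A model identical to the base has no direction to compare.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_kinship(reference, models, base_weights):
    """Return (name, kinship) pairs for all other models, lowest kinship first."""
    scores = [
        (name, cosine_kinship(models[reference], weights, base_weights))
        for name, weights in models.items()
        if name != reference
    ]
    return sorted(scores, key=lambda item: item[1])


# Toy example with two-parameter "models" (hypothetical values).
base = [0.0, 0.0]
models = {"a": [1.0, 0.0], "b": [2.0, 0.0], "c": [0.0, 1.0]}
ranking = rank_by_kinship("a", models, base)
```

In this toy setup, model "b" moves away from the base in the same direction as "a", while "c" moves orthogonally, so "c" is ranked as the least related candidate. Whether a merging strategy should prefer low-kinship or high-kinship candidates at a given step is a design decision that this sketch leaves open.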
