選択と統合：大規模言語モデルを用いた適応性と拡張性のある固有表現認識に向けて

要旨

教師ありファインチューニング（SFT）は、大規模言語モデル（LLM）を情報抽出（IE）タスク、例えば固有表現認識（NER）などに適合させるために広く用いられています。しかし、このような細かいラベルのアノテーションやドメイン固有のモデルの訓練にはコストがかかります。既存の研究では、通常、複数のドメインにわたって統一されたモデルを訓練しますが、このようなアプローチは適応性と拡張性に欠けています。なぜなら、すべての訓練データがターゲットドメインに有益であるとは限らず、訓練済みモデルのスケーリングも依然として課題だからです。本論文では、推論時に専門家モデルを動的に選択し統合するSaMフレームワークを提案します。具体的には、ターゲットドメインに対して、（i）ターゲットドメインとのドメイン類似性と（ii）サンプルインスタンスでの性能に基づいて、既存のドメインで事前訓練されたドメイン固有の専門家を選択します。その後、専門家を統合して、ターゲットドメインに最適化されたタスク固有のモデルを作成します。ターゲットドメインに有益な専門家を動的に統合することで、追加の訓練なしにさまざまなドメインでの汎化性能を向上させます。さらに、専門家を簡単に追加または削除できるため、高い拡張性を実現します。複数のベンチマークでの大規模な実験により、本フレームワークの有効性が実証され、統一モデルを平均10％上回る性能を示しました。また、本フレームワークの潜在的な改善点、実践的な経験、および拡張についての洞察を提供します。

English

Supervised fine-tuning (SFT) is widely used to align large language models (LLMs) with information extraction (IE) tasks, such as named entity recognition (NER). However, annotating such fine-grained labels and training domain-specific models is costly. Existing works typically train a unified model across multiple domains, but such approaches lack adaptation and scalability since not all training data benefits target domains and scaling trained models remains challenging. We propose the SaM framework, which dynamically Selects and Merges expert models at inference time. Specifically, for a target domain, we select domain-specific experts pre-trained on existing domains based on (i) domain similarity to the target domain and (ii) performance on sampled instances, respectively. The experts are then merged to create task-specific models optimized for the target domain. By dynamically merging experts beneficial to target domains, we improve generalization across various domains without extra training. Additionally, experts can be added or removed conveniently, leading to great scalability. Extensive experiments on multiple benchmarks demonstrate our framework's effectiveness, which outperforms the unified model by an average of 10%. We further provide insights into potential improvements, practical experience, and extensions of our framework.

選択と統合：大規模言語モデルを用いた適応性と拡張性のある固有表現認識に向けて

Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models

要旨

Support