
Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models

June 28, 2025
作者: Zhuojun Ding, Wei Wei, Chenghao Fan
cs.AI

Abstract

Supervised fine-tuning (SFT) is widely used to align large language models (LLMs) with information extraction (IE) tasks such as named entity recognition (NER). However, annotating such fine-grained labels and training domain-specific models is costly. Existing works typically train a unified model across multiple domains, but such approaches lack adaptability and scalability, since not all training data benefits target domains and scaling trained models remains challenging. We propose the SaM framework, which dynamically Selects and Merges expert models at inference time. Specifically, for a target domain, we select domain-specific experts pre-trained on existing domains based on (i) their domain similarity to the target domain and (ii) their performance on sampled instances. The selected experts are then merged to create task-specific models optimized for the target domain. By dynamically merging experts beneficial to the target domain, we improve generalization across various domains without extra training. Additionally, experts can be added or removed conveniently, giving the framework strong scalability. Extensive experiments on multiple benchmarks demonstrate the effectiveness of our framework, which outperforms the unified model by an average of 10%. We further provide insights into potential improvements, practical experience, and extensions of our framework.
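
The abstract outlines a two-step procedure: score each pre-trained expert against the target domain, then merge the top-ranked experts' parameters into one task-specific model. Below is a minimal PyTorch sketch of that idea; the cosine-similarity scoring, the additive combination of the two selection criteria, and the softmax-weighted parameter average are all illustrative assumptions, not the paper's exact method, and all function names are hypothetical.

```python
# Minimal sketch of select-then-merge over expert state dicts.
# Assumptions: each expert has a domain embedding, eval_fn returns a
# score (e.g., F1) on a few sampled target instances, and merging is a
# weighted average of parameters. The paper's actual rules may differ.

import torch
import torch.nn.functional as F

def select_experts(expert_embs, target_emb, eval_fn, top_k=3):
    """Rank experts by (i) domain similarity to the target domain and
    (ii) performance on sampled instances; keep the top-k overall."""
    scores = {}
    for name, emb in expert_embs.items():
        sim = F.cosine_similarity(emb.unsqueeze(0),
                                  target_emb.unsqueeze(0)).item()
        perf = eval_fn(name)        # e.g., F1 on sampled target instances
        scores[name] = sim + perf   # additive combination is an assumption
    chosen = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return chosen, scores

def merge_experts(expert_states, chosen, scores):
    """Merge the chosen experts via a softmax-weighted parameter average
    (one plausible merge rule among several)."""
    weights = torch.softmax(
        torch.tensor([scores[n] for n in chosen]), dim=0)
    merged = {}
    for key in expert_states[chosen[0]]:
        merged[key] = sum(w * expert_states[n][key]
                          for w, n in zip(weights, chosen))
    return merged
```

In practice, the merged state dict would be loaded into the base LLM before running NER inference on the target domain; because selection happens per target domain at inference time, adding or removing an expert only changes the candidate pool, with no retraining.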