

Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models

June 28, 2025
Authors: Zhuojun Ding, Wei Wei, Chenghao Fan
cs.AI

Abstract

Supervised fine-tuning (SFT) is widely used to align large language models (LLMs) with information extraction (IE) tasks, such as named entity recognition (NER). However, annotating such fine-grained labels and training domain-specific models is costly. Existing works typically train a unified model across multiple domains, but such approaches lack adaptability and scalability, since not all training data benefits a target domain and scaling already-trained models remains challenging. We propose the SaM framework, which dynamically Selects and Merges expert models at inference time. Specifically, for a target domain, we select domain-specific experts pre-trained on existing domains based on (i) their similarity to the target domain and (ii) their performance on sampled target instances. The selected experts are then merged to create a task-specific model optimized for the target domain. By dynamically merging the experts that benefit a target domain, we improve generalization across various domains without extra training. Additionally, experts can be added or removed conveniently, providing strong scalability. Extensive experiments on multiple benchmarks demonstrate our framework's effectiveness, outperforming the unified model by an average of 10%. We further provide insights into potential improvements, practical experience, and extensions of our framework.
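The abstract describes the select-then-merge procedure only at a high level. The sketch below is a rough illustration, not the paper's implementation: it assumes each expert is stored as a dictionary of parameter arrays (e.g., adapter weights), combines the two selection signals with a hypothetical weighting factor `alpha`, and merges the chosen experts by simple parameter averaging; the actual scoring and merging schemes used in SaM may differ.

```python
import numpy as np

def select_and_merge(experts, similarity, performance, k=2, alpha=0.5):
    """Illustrative select-and-merge for a target domain.

    experts:     {domain: {param_name: np.ndarray}}  -- per-domain expert weights (assumed format)
    similarity:  {domain: float}  -- similarity of each source domain to the target
    performance: {domain: float}  -- expert score on sampled target instances
    k, alpha:    hypothetical hyperparameters (number of experts kept, signal weighting)
    """
    # Combine the two selection signals; this linear mix is an assumption.
    scores = {d: alpha * similarity[d] + (1 - alpha) * performance[d] for d in experts}
    chosen = sorted(scores, key=scores.get, reverse=True)[:k]

    # Merge by plain parameter averaging over the selected experts
    # (score-weighted averaging or other merging rules are equally plausible).
    merged = {
        name: np.mean([experts[d][name] for d in chosen], axis=0)
        for name in experts[chosen[0]]
    }
    return chosen, merged

if __name__ == "__main__":
    # Toy example with three hypothetical source-domain experts.
    rng = np.random.default_rng(0)
    experts = {d: {"lora_A": rng.normal(size=(4, 8))} for d in ("news", "biomed", "social")}
    similarity = {"news": 0.7, "biomed": 0.2, "social": 0.5}
    performance = {"news": 0.6, "biomed": 0.3, "social": 0.8}
    chosen, merged = select_and_merge(experts, similarity, performance, k=2)
    print(chosen, merged["lora_A"].shape)
```

Because selection and merging happen at inference time, adding or removing an expert only changes the pool passed to such a routine; no retraining of existing experts is required, which is the scalability property the abstract emphasizes.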