Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering

Abstract

Large Language Models (LLMs) have gained popularity and achieved remarkable results in open-domain tasks, but their performance in real industrial domain-specific scenarios is mediocre because they lack domain-specific knowledge. This issue has attracted widespread attention, yet few relevant benchmarks are available. In this paper, we provide a benchmark Question Answering (QA) dataset named MSQA, which covers Microsoft products and the IT technical problems encountered by customers. The dataset contains industry cloud-specific QA knowledge that general-purpose LLMs do not have, so it is well suited for evaluating methods aimed at improving the domain-specific capabilities of LLMs. In addition, we propose a new model interaction paradigm that empowers an LLM to perform better on domain-specific tasks where it is not proficient. Extensive experiments demonstrate that the approach following our model fusion framework outperforms the commonly used combination of an LLM with retrieval methods.
