Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering

Abstract

Large language models (LLMs) have gained popularity and achieved remarkable results on open-domain tasks, but their performance in real industrial domain-specific scenarios is only average because they lack the relevant domain knowledge. This issue has attracted widespread attention, yet few relevant benchmarks are available. In this paper, we provide MSQA, a benchmark question answering (QA) dataset about Microsoft products and the IT technical problems that customers encounter. The dataset contains industry cloud-specific QA knowledge that is not available to general-purpose LLMs, so it is well suited to evaluating methods that aim to improve the domain-specific capabilities of LLMs. In addition, we propose a new model interaction paradigm that can empower an LLM to perform better on domain-specific tasks where it is not proficient. Extensive experiments demonstrate that the approach following our model fusion framework outperforms the commonly used combination of an LLM with retrieval methods.
