Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering

Abstract

Large language models (LLMs) have gained popularity and achieved remarkable results in open-domain tasks, but their performance in real industrial domain-specific scenarios is only average because they lack the relevant domain knowledge. This issue has attracted widespread attention, yet few relevant benchmarks are available. In this paper, we provide a benchmark question answering (QA) dataset named MSQA, which covers Microsoft products and the IT technical problems encountered by customers. The dataset contains industry cloud-specific QA knowledge that is not available to general-purpose LLMs, so it is well suited for evaluating methods aimed at improving the domain-specific capabilities of LLMs. In addition, we propose a new model interaction paradigm that can empower an LLM to achieve better performance on domain-specific tasks where it is not proficient. Extensive experiments demonstrate that the approach following our model fusion framework outperforms the commonly used LLM with retrieval methods.
