Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering

Abstract

Large language models (LLMs) have gained popularity and achieved remarkable results in open-domain tasks, but their performance in real industrial domain-specific scenarios is only average because they lack domain-specific knowledge. This issue has attracted widespread attention, but few relevant benchmarks are available. In this paper, we provide a benchmark question answering (QA) dataset named MSQA, which covers Microsoft products and IT technical problems encountered by customers. The dataset contains industry cloud-specific QA knowledge that is not available to general-purpose LLMs, so it is well suited for evaluating methods aimed at improving the domain-specific capabilities of LLMs. In addition, we propose a new model interaction paradigm that can empower an LLM to perform better on domain-specific tasks in which it is not proficient. Extensive experiments demonstrate that the approach following our model fusion framework outperforms commonly used methods that combine LLMs with retrieval.
