Dr. LLaMA: Improving Small Language Models in Domain-Specific QA via Generative Data Augmentation

May 12, 2023
Authors: Zhen Guo, Peiqi Wang, Yanwei Wang, Shangdi Yu
cs.AI

Abstract

Large Language Models (LLMs) have made significant strides in natural language processing but face challenges in terms of computational expense and inefficiency as they grow in size, especially in domain-specific tasks. Small Language Models (SLMs), on the other hand, often struggle in these tasks due to limited capacity and training data. In this paper, we introduce Dr. LLaMA, a method for improving SLMs through generative data augmentation using LLMs, focusing on medical question-answering tasks and the PubMedQA dataset. Our findings indicate that LLMs effectively refine and diversify existing question-answer pairs, resulting in improved performance of a much smaller model on domain-specific QA datasets after fine-tuning. This study highlights the challenges of using LLMs for domain-specific question answering and suggests potential research directions to address these limitations, ultimately aiming to create more efficient and capable models for specialized applications. We have also made our code available for interested researchers.
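The core idea, as described in the abstract, is to have an LLM rewrite and diversify existing PubMedQA-style question-answer pairs, then fine-tune a smaller model on the enlarged set. The snippet below is a minimal sketch of that augmentation step, not the authors' implementation: the model name, the prompt wording, and the `augment_qa_pair` helper are illustrative assumptions, and the paper's released code should be consulted for the actual procedure.

```python
# Sketch of generative data augmentation for QA pairs (illustrative only).
# Assumptions not taken from the paper: an instruction-tuned LLM served via the
# Hugging Face `transformers` text-generation pipeline, and records with
# "question" and "answer" fields as in PubMedQA.
from transformers import pipeline

generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")

def augment_qa_pair(question: str, answer: str, n_variants: int = 2) -> list[dict]:
    """Ask the LLM to rephrase an existing QA pair into new training examples."""
    augmented = []
    for _ in range(n_variants):
        prompt = (
            "Rewrite the following medical question and answer so that the "
            "meaning is preserved but the wording differs.\n"
            f"Question: {question}\nAnswer: {answer}\nRewritten Question:"
        )
        out = generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.9)
        # The text-generation pipeline returns the prompt plus the continuation;
        # strip the prompt to keep only the generated rewrite.
        text = out[0]["generated_text"][len(prompt):].strip()
        # Naive parse: assume the model emits "Rewritten Answer:" after the question.
        if "Rewritten Answer:" in text:
            q_new, a_new = text.split("Rewritten Answer:", 1)
            augmented.append({"question": q_new.strip(), "answer": a_new.strip()})
    return augmented

# The augmented pairs would then be merged into the fine-tuning set
# used to train the smaller domain-specific model.
```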