Dr. LLaMA: Improving Small Language Models in Domain-Specific QA via Generative Data Augmentation

Abstract

Large Language Models (LLMs) have made significant strides in natural language processing, but as they grow in size they become computationally expensive and inefficient, especially on domain-specific tasks. Small Language Models (SLMs), on the other hand, often struggle on these tasks because of limited capacity and training data. In this paper, we introduce Dr. LLaMA, a method for improving SLMs through generative data augmentation using LLMs, focusing on medical question-answering tasks and the PubMedQA dataset. Our findings indicate that LLMs effectively refine and diversify existing question-answer pairs, and that a much smaller model fine-tuned on the augmented data performs better on domain-specific QA datasets. This study highlights the challenges of using LLMs for domain-specific question answering and suggests potential research directions to address these limitations, ultimately aiming to create more efficient and capable models for specialized applications. We have also made our code available for interested researchers.
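
The augmentation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function names (`augment_qa_pairs`, `toy_rewrite`), the record fields (`question`, `context`, `answer`), and the template-based rewriter are all assumptions made for illustration. In practice the `rewrite` callable would wrap a call to an LLM that paraphrases or refines the question; a deterministic stand-in is used here so the example runs on its own.

```python
from typing import Callable, Dict, List

QAPair = Dict[str, str]  # assumed fields: "question", "context", "answer"


def augment_qa_pairs(
    pairs: List[QAPair],
    rewrite: Callable[[str, int], str],
    n_variants: int = 2,
) -> List[QAPair]:
    """Return the original pairs plus rewritten-question variants.

    The answer label and context are copied unchanged; only the question
    text is varied. Empty and duplicate questions are dropped.
    """
    augmented: List[QAPair] = []
    seen = set()
    for pair in pairs:
        # Original question first, then the generated variants.
        candidates = [pair["question"]]
        candidates += [rewrite(pair["question"], i) for i in range(n_variants)]
        for question in candidates:
            question = question.strip()
            key = (question.lower(), pair["answer"])
            if question and key not in seen:
                seen.add(key)
                augmented.append(
                    {
                        "question": question,
                        "context": pair["context"],
                        "answer": pair["answer"],
                    }
                )
    return augmented


def toy_rewrite(question: str, variant: int) -> str:
    """Hypothetical stand-in for an LLM paraphraser (template-based)."""
    templates = ["Is it true that {}", "According to the context, {}"]
    body = question[0].lower() + question[1:]
    return templates[variant % len(templates)].format(body)


if __name__ == "__main__":
    # Made-up placeholder record, not taken from PubMedQA.
    seed = [
        {
            "question": "Does treatment X improve outcome Y?",
            "context": "placeholder context",
            "answer": "yes",
        }
    ]
    for row in augment_qa_pairs(seed, toy_rewrite):
        print(row["answer"], "|", row["question"])
```

The augmented list would then serve as the fine-tuning set for the smaller model. Keeping the label fixed while varying only the question is one simple design choice; it assumes the rewriter preserves the meaning of the question, which a real pipeline would need to check.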


