BioMamba: A Pre-trained Biomedical Language Representation Model Leveraging Mamba

August 5, 2024
Authors: Ling Yue, Sixue Xing, Yingzhou Lu, Tianfan Fu
cs.AI

Abstract

The advancement of natural language processing (NLP) in biology hinges on models' ability to interpret intricate biomedical literature. Traditional models often struggle with the complex, domain-specific language of this field. In this paper, we present BioMamba, a pre-trained model specifically designed for biomedical text mining. BioMamba builds upon the Mamba architecture and is pre-trained on an extensive corpus of biomedical literature. Our empirical studies demonstrate that BioMamba significantly outperforms models such as BioBERT and general-domain Mamba across various biomedical tasks. For instance, BioMamba achieves a 100-fold reduction in perplexity and a 4-fold reduction in cross-entropy loss on the BioASQ test set. We provide an overview of the model architecture, the pre-training process, and fine-tuning techniques. Additionally, we release the code and trained model to facilitate further research.
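
The perplexity and cross-entropy figures quoted in the abstract are two views of the same quantity: perplexity is the exponential of the per-token cross-entropy, so a 100-fold perplexity reduction corresponds to a drop of ln(100) ≈ 4.6 nats in cross-entropy. The sketch below shows how such numbers can be measured with Hugging Face transformers. The checkpoint id and example sentence are illustrative assumptions, not taken from the paper; the authors' released code would provide the actual BioMamba checkpoint.

```python
# Minimal sketch of a perplexity measurement with Hugging Face transformers.
# The checkpoint id below is a placeholder (a public general-domain Mamba
# model); swap in the released BioMamba checkpoint to reproduce the paper's
# setting. The example sentence is likewise an illustrative assumption.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "state-spaces/mamba-130m-hf"  # placeholder, not the BioMamba checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
model.eval()

text = "The BRCA1 protein plays a central role in DNA double-strand break repair."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the mean next-token
    # cross-entropy loss over the sequence.
    outputs = model(**inputs, labels=inputs["input_ids"])

cross_entropy = outputs.loss.item()   # nats per token
perplexity = math.exp(cross_entropy)  # perplexity = exp(cross-entropy)
print(f"cross-entropy: {cross_entropy:.3f}, perplexity: {perplexity:.2f}")
```

Averaging the per-token loss over a held-out corpus such as the BioASQ test set, rather than a single sentence, yields the kind of aggregate perplexity comparison the abstract reports.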
