InkubaLM: A small language model for low-resource African languages
August 30, 2024
Authors: Atnafu Lambebo Tonja, Bonaventure F. P. Dossou, Jessica Ojo, Jenalea Rajab, Fadel Thior, Eric Peter Wairagala, Aremu Anuoluwapo, Pelonomi Moiloa, Jade Abbott, Vukosi Marivate, Benjamin Rosman
cs.AI
Abstract
High-resource language models often fall short in the African context, where
there is a critical need for models that are efficient, accessible, and locally
relevant, even amidst significant computing and data constraints. This paper
introduces InkubaLM, a small language model with 0.4 billion parameters, which
achieves performance comparable to models with significantly larger parameter
counts and more extensive training data on tasks such as machine translation,
question-answering, AfriMMLU, and the AfriXnli task. Notably, InkubaLM
outperforms many larger models in sentiment analysis and demonstrates
remarkable consistency across multiple languages. This work represents a
pivotal advancement in challenging the conventional paradigm that effective
language models must rely on substantial resources. Our model and datasets are
publicly available at \url{https://huggingface.co/lelapa} to encourage
research and development on low-resource languages.