MaLA-500: Massive Language Adaptation of Large Language Models

January 24, 2024
Authors: Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze
cs.AI

Abstract

Large language models have advanced the state of the art in natural language processing. However, their predominant design for English or a limited set of languages creates a substantial gap in their effectiveness for low-resource languages. To bridge this gap, we introduce MaLA-500, a novel large language model designed to cover an extensive range of 534 languages. To train MaLA-500, we employ vocabulary extension and continued pretraining on LLaMA 2 with Glot500-c. Our experiments on SIB-200 show that MaLA-500 achieves state-of-the-art in-context learning results. We release MaLA-500 at https://huggingface.co/MaLA-LM.
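The core recipe named in the abstract, extending LLaMA 2's vocabulary and then continuing pretraining on Glot500-c, can be illustrated with a minimal sketch. The code below assumes the Hugging Face transformers API and access to the gated LLaMA 2 checkpoint; the new subword pieces are hypothetical placeholders, not the tokenizer actually trained for MaLA-500, whose exact configuration is described in the paper.

```python
# Minimal sketch of the vocabulary-extension step, assuming the
# Hugging Face transformers stack. Not the paper's actual recipe.
from transformers import AutoTokenizer, AutoModelForCausalLM

base = "meta-llama/Llama-2-7b-hf"  # gated checkpoint; requires access approval
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical new subword pieces; in practice these would come from a
# multilingual tokenizer trained on Glot500-c text.
new_pieces = ["▁dumela", "▁habari", "▁sawubona"]
num_added = tokenizer.add_tokens(new_pieces)

# Grow the input/output embedding matrices to cover the enlarged
# vocabulary; the new rows start randomly initialized.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; vocabulary size is now {len(tokenizer)}")
```

After this step, continued pretraining on Glot500-c would train the new embedding rows along with the rest of the model, which is where coverage of the 534 languages is actually learned.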