MaLA-500: Massive Language Adaptation of Large Language Models

January 24, 2024
Authors: Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze
cs.AI

Abstract

Large language models have advanced the state of the art in natural language processing. However, their predominant design for English or a limited set of languages creates a substantial gap in their effectiveness for low-resource languages. To bridge this gap, we introduce MaLA-500, a novel large language model designed to cover an extensive range of 534 languages. To train MaLA-500, we employ vocabulary extension and continued pretraining on LLaMA 2 with Glot500-c. Our experiments on SIB-200 show that MaLA-500 achieves state-of-the-art in-context learning results. We release MaLA-500 at https://huggingface.co/MaLA-LM.
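The core recipe named in the abstract, extending LLaMA 2's vocabulary and then continuing pretraining on Glot500-c, can be illustrated with a minimal sketch. The code below assumes the Hugging Face transformers API and access to the gated LLaMA 2 checkpoint; the new subword pieces are hypothetical placeholders, not the tokenizer actually trained for MaLA-500, whose exact configuration is described in the paper.

```python
# Minimal sketch of the vocabulary-extension step, assuming the
# Hugging Face transformers stack. Not the paper's actual recipe.
from transformers import AutoTokenizer, AutoModelForCausalLM

base = "meta-llama/Llama-2-7b-hf"  # gated checkpoint; requires access approval
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Hypothetical new subword pieces; in practice these would come from a
# multilingual tokenizer trained on Glot500-c text.
new_pieces = ["▁dumela", "▁habari", "▁sawubona"]
num_added = tokenizer.add_tokens(new_pieces)

# Grow the input/output embedding matrices to cover the enlarged
# vocabulary; the new rows start randomly initialized.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; vocabulary size is now {len(tokenizer)}")
```

After this step, continued pretraining on Glot500-c would train the new embedding rows along with the rest of the model, which is where coverage of the 534 languages is actually learned.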