MaLA-500: Massive Language Adaptation of Large Language Models
January 24, 2024
Authors: Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze
cs.AI
Abstract
Large language models have advanced the state of the art in natural language
processing. However, they are predominantly designed for English or a limited
set of languages, which creates a substantial gap in their effectiveness for
low-resource languages. To bridge this gap, we introduce MaLA-500, a novel
large language model designed to cover an extensive range of 534 languages.
To train MaLA-500, we employ vocabulary extension and continued pretraining
on LLaMA 2 with Glot500-c. Our experiments on SIB-200 show that MaLA-500
achieves state-of-the-art in-context learning results. We release MaLA-500
at https://huggingface.co/MaLA-LM.
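
The core recipe named in the abstract, vocabulary extension followed by continued pretraining of LLaMA 2, can be illustrated in a few lines. The sketch below assumes a Hugging Face transformers workflow; the base checkpoint name and the new-token list are illustrative assumptions, not the authors' released pipeline or vocabulary.

```python
# Minimal sketch of vocabulary extension on LLaMA 2, assuming a Hugging Face
# transformers workflow. The new-token list is a hypothetical placeholder,
# not the subword vocabulary MaLA-500 was actually trained with.
from transformers import AutoTokenizer, AutoModelForCausalLM

base = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Subword pieces learned from a multilingual corpus (illustrative examples).
new_tokens = ["▁ŋga", "▁tlhing", "▁mokhoa"]
num_added = tokenizer.add_tokens(new_tokens)

# Grow the embedding and output matrices so the new token ids are trainable;
# the new rows are then updated during continued pretraining on the
# multilingual corpus (e.g., with transformers.Trainer on Glot500-c data).
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; vocabulary size is now {len(tokenizer)}")
```

Extending the tokenizer before continued pretraining lets low-resource languages be segmented into fewer, more meaningful pieces than the original English-centric vocabulary allows, which is the gap the abstract describes.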