mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences
May 18, 2023
Authors: David Uthus, Santiago Ontañón, Joshua Ainslie, Mandy Guo
cs.AI
Abstract
We present our work on developing a multilingual, efficient text-to-text transformer that is suitable for handling long inputs. This model, called mLongT5, builds upon the architecture of LongT5, while leveraging the multilingual datasets used for pretraining mT5 and the pretraining tasks of UL2. We evaluate this model on a variety of multilingual summarization and question-answering tasks, and the results show stronger performance for mLongT5 when compared to existing multilingual models such as mBART or M-BERT.
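Because mLongT5 keeps LongT5's encoder-decoder architecture, a converted checkpoint should load through the existing LongT5 classes in Hugging Face Transformers. The sketch below illustrates this; the model id is a hypothetical placeholder (substitute whatever mLongT5 checkpoint you actually have), and the 16384-token input length reflects typical LongT5-style long-input settings rather than anything mandated by the paper.

```python
# Minimal sketch: running an mLongT5-style checkpoint for long-document
# summarization. Assumes a checkpoint converted to the Hugging Face
# LongT5 format is available locally or on the Hub.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

model_id = "path/to/mlongt5-base"  # hypothetical placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LongT5ForConditionalGeneration.from_pretrained(model_id)

# mLongT5 pairs LongT5's efficient attention with mT5's multilingual
# vocabulary, so the input can be a long article in any mT5 language.
long_document = "..."  # replace with a real long input document
inputs = tokenizer(
    long_document,
    max_length=16384,   # long-input length; far beyond vanilla T5's 512
    truncation=True,
    return_tensors="pt",
)

# Generate an abstractive summary with beam search.
summary_ids = model.generate(**inputs, max_new_tokens=256, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

The practical appeal of this design is that LongT5's sparse (transient-global) attention scales roughly linearly with input length, so the same script handles book-length inputs that would exhaust memory with a standard full-attention multilingual model.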