
mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences

May 18, 2023
Authors: David Uthus, Santiago Ontañón, Joshua Ainslie, Mandy Guo
cs.AI

Abstract

We present our work on developing a multilingual, efficient text-to-text transformer that is suitable for handling long inputs. This model, called mLongT5, builds upon the architecture of LongT5, while leveraging the multilingual datasets used for pretraining mT5 and the pretraining tasks of UL2. We evaluate this model on a variety of multilingual summarization and question-answering tasks, and the results show stronger performance for mLongT5 when compared to existing multilingual models such as mBART or M-BERT.
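Since mLongT5 builds on the LongT5 architecture, it can in principle be loaded and run with standard LongT5 tooling. The following is a minimal sketch of long-document multilingual summarization using the Hugging Face transformers library; the checkpoint identifier is an assumption (community uploads such as agemagician/mlong-t5-tglobal-base exist, but the exact name may differ from any official release) and is not part of the paper.

```python
# A minimal sketch of multilingual long-document summarization with mLongT5.
# The checkpoint name below is assumed, not taken from the paper.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "agemagician/mlong-t5-tglobal-base"  # hypothetical/community checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# mLongT5 inherits LongT5's efficient attention, so inputs far beyond the
# usual 512-token T5 limit can be encoded directly.
long_document = "..."  # a long article in any of the mT5 pretraining languages
inputs = tokenizer(
    long_document,
    return_tensors="pt",
    max_length=4096,
    truncation=True,
)

# Generate a short abstractive summary with beam search.
summary_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

The same pattern applies to the question-answering tasks evaluated in the paper: the question and long context are concatenated into a single input sequence and the answer is produced by the decoder.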