BigTrans: Augmenting Large Language Models with Multilingual Translation Capability over 100 Languages
May 29, 2023
Authors: Wen Yang, Chong Li, Jiajun Zhang, Chengqing Zong
cs.AI
Abstract
Large language models (LLMs) demonstrate promising translation performance across various natural languages. However, many LLMs, especially open-sourced ones such as BLOOM and LLaMA, are English-dominant and support only dozens of natural languages, leaving the potential of LLMs for language translation largely unexplored. In this work, we present BigTrans, which adapts LLaMA, a model covering only 20 languages, and enhances it with multilingual translation capability on more than 100 languages. BigTrans is built upon LLaMA-13B and is optimized in three steps. First, we continue training LLaMA with massive Chinese monolingual data. Second, we continue training the model with a large-scale parallel dataset covering 102 natural languages. Third, we instruction-tune the foundation model with multilingual translation instructions, yielding our BigTrans model. Preliminary experiments on multilingual translation show that BigTrans performs comparably to ChatGPT and Google Translate in many languages and even outperforms ChatGPT in 8 language pairs. We release the BigTrans model and hope it can advance research progress.
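The three-step recipe ends with instruction tuning for translation and a released checkpoint. Below is a minimal sketch of how one might prompt such an instruction-tuned causal LM for a single translation direction with Hugging Face Transformers. The model path and the instruction template are hypothetical, introduced only for illustration; the paper's exact prompt format and release location are not specified in the abstract.

```python
# Minimal sketch: prompting an instruction-tuned causal LM for translation.
# Assumptions (not from the abstract): the released weights load as a standard
# Hugging Face causal LM, and the prompt template below is illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "path/to/bigtrans-13b"  # hypothetical checkpoint path or hub ID

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Illustrative instruction-style prompt for one translation direction.
prompt = (
    "Translate the following sentence from English to German:\n"
    "The weather is nice today.\n"
    "Translation:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```

Greedy decoding (`do_sample=False`) is used here because translation evaluation typically favors deterministic output; sampling parameters could be added for more varied generations.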