

Chain-of-Dictionary Prompting Elicits Translation in Large Language Models

May 11, 2023
作者: Hongyuan Lu, Haoyang Huang, Dongdong Zhang, Haoran Yang, Wai Lam, Furu Wei
cs.AI

Abstract

Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT), even when trained without parallel data. Yet, despite the gigantic amount of training data, they still struggle with translating rare words, particularly for low-resource languages. Even worse, it is usually unrealistic to retrieve relevant demonstrations for in-context learning with low-resource languages on LLMs, which restricts the practical use of LLMs for translation -- how should we mitigate this problem? To this end, we present a novel method, CoD, which augments LLMs with prior knowledge from chains of multilingual dictionaries for a subset of input words to elicit their translation abilities. Extensive experiments indicate that augmenting ChatGPT with CoD elicits large gains of up to 13x in ChrF++ points for MNMT (3.08 to 42.63 for English into Serbian written in Cyrillic script) on the full FLORES-200 devtest set. We further demonstrate the importance of chaining the multilingual dictionaries, as well as the superiority of CoD over few-shot demonstrations for low-resource languages.
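To make the idea concrete, below is a minimal sketch of how a CoD-style prompt might be assembled: chained multilingual dictionary hints for a subset of input words are prepended to the translation instruction before the text is sent to an LLM. The `build_cod_prompt` helper, the dictionary entries, and the exact template wording are illustrative assumptions for this sketch, not the paper's released implementation.

```python
# Minimal sketch of Chain-of-Dictionary (CoD) style prompting, assuming
# hypothetical dictionary data and a simplified prompt template; the paper's
# exact template and dictionary sources may differ.

def build_cod_prompt(source_text, src_lang, tgt_lang, dictionary_chains):
    """Prepend chained multilingual dictionary hints to a translation request.

    dictionary_chains maps a source word to a list of (language, translation)
    pairs, e.g. {"fever": [("German", "Fieber"), ("Serbian", "грозница")]}.
    """
    hint_lines = []
    for word, chain in dictionary_chains.items():
        # Chain the word through each auxiliary language:
        # "word" means "x" means "y".
        links = [f'"{word}"'] + [f'"{translation}"' for _, translation in chain]
        hint_lines.append(" means ".join(links) + ".")
    hints = "\n".join(hint_lines)
    return (
        f"{hints}\n\n"
        f"Translate the following text from {src_lang} into {tgt_lang}:\n"
        f"{source_text}"
    )

# Example: augment the prompt for one rare word before calling an LLM.
prompt = build_cod_prompt(
    source_text="The patient developed a high fever overnight.",
    src_lang="English",
    tgt_lang="Serbian (Cyrillic script)",
    dictionary_chains={"fever": [("German", "Fieber"), ("Serbian", "грозница")]},
)
print(prompt)
```

The chaining matters because a rare source word may have no reliable direct entry in the low-resource target language; routing it through intermediate languages gives the model multiple anchors for the same concept.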