

Chain-of-Dictionary Prompting Elicits Translation in Large Language Models

May 11, 2023
Authors: Hongyuan Lu, Haoyang Huang, Dongdong Zhang, Haoran Yang, Wai Lam, Furu Wei
cs.AI

Abstract

Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT), even when trained without parallel data. Yet, despite their gigantic training corpora, they still struggle to translate rare words, particularly in low-resource languages. Even worse, it is usually unrealistic to retrieve relevant demonstrations for in-context learning with low-resource languages on LLMs, which restricts their practical use for translation. How should we mitigate this problem? To this end, we present a novel method, CoD, which augments LLMs with prior knowledge from chains of multilingual dictionaries for a subset of input words, eliciting their translation abilities. Extensive experiments indicate that augmenting ChatGPT with CoD yields large gains, up to a 13x improvement in ChrF++ (from 3.08 to 42.63 for English into Serbian written in Cyrillic script), on the full FLORES-200 devtest set. We further demonstrate the importance of chaining the multilingual dictionaries, as well as the superiority of CoD over few-shot demonstrations for low-resource languages.
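To make the idea concrete, below is a minimal sketch of how a CoD-style prompt might be assembled for a single sentence. It assumes a generic text-prompted LLM; the template wording, the choice of auxiliary languages (French and German here), and the dictionary entries are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of Chain-of-Dictionary (CoD) prompting.
# Assumption: the exact template and word-selection strategy in the
# paper may differ; this only illustrates the chaining structure.

# Chained multilingual dictionary hints for a subset of source words:
# each chain links a source word through auxiliary languages to the
# target language, giving the model several anchor points per word.
dictionary_chains = [
    '"village" means "village" (French) means "Dorf" (German) means "село" (Serbian).',
    '"harvest" means "récolte" (French) means "Ernte" (German) means "жетва" (Serbian).',
]

source_text = "The village celebrates the harvest every autumn."

# Prepend the chained dictionary hints to the translation request.
prompt = (
    "Translate the following text from English into Serbian (Cyrillic script).\n"
    + "\n".join(dictionary_chains)
    + f"\nEnglish: {source_text}\nSerbian:"
)

print(prompt)  # send this string to the LLM of your choice
```

Per the abstract, the chaining itself matters: linking each source word through several auxiliary languages, rather than supplying a single bilingual entry, is what the authors identify as important for eliciting translations of rare words.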