Dissecting In-Context Learning of Translations in GPTs
October 24, 2023
Authors: Vikas Raunak, Hany Hassan Awadalla, Arul Menezes
cs.AI
Abstract
Most of the recent work in leveraging Large Language Models (LLMs) such as
GPT-3 for Machine Translation (MT) has focused on selecting the few-shot
samples for prompting. In this work, we try to better understand the role of
demonstration attributes for the in-context learning of translations through
perturbations of high-quality, in-domain demonstrations. We find that
asymmetric perturbation of the source-target mappings yields vastly different
results. We show that the perturbation of the source side has surprisingly
little impact, while target perturbation can drastically reduce translation
quality, suggesting that it is the output text distribution that provides the
most important learning signal during in-context learning of translations. We
propose a method named Zero-Shot-Context to add this signal automatically in
zero-shot prompting. We demonstrate that it improves upon the zero-shot
translation performance of GPT-3, even making it competitive with few-shot
prompted translations.
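
To make the idea of an asymmetric perturbation concrete, the following is a minimal sketch of how one side of a set of few-shot demonstrations could be perturbed while the other side is left intact. The abstract does not specify the perturbations or the prompt template, so everything here is an assumption for illustration: the helper names `rotate`, `perturb`, and `build_prompt` are hypothetical, rotation by one position is just one simple way to break the source-target mapping, and the `English:`/`German:` template is a placeholder.

```python
from typing import List, Tuple

Demo = Tuple[str, str]  # (source sentence, target translation)


def rotate(items: List[str]) -> List[str]:
    """Shift every item one position to the left (the first item moves to the end)."""
    return items[1:] + items[:1]


def perturb(demos: List[Demo], side: str) -> List[Demo]:
    """Break the source-target mapping by rotating only one side.

    Assumption: rotation is an illustrative perturbation, not necessarily
    one used in the paper.
    """
    sources = [s for s, _ in demos]
    targets = [t for _, t in demos]
    if side == "source":
        sources = rotate(sources)
    elif side == "target":
        targets = rotate(targets)
    else:
        raise ValueError("side must be 'source' or 'target'")
    return list(zip(sources, targets))


def build_prompt(demos: List[Demo], query: str,
                 src_lang: str = "English", tgt_lang: str = "German") -> str:
    """Format demonstrations and the query as a few-shot translation prompt.

    Assumption: this template is hypothetical.
    """
    lines = []
    for source, target in demos:
        lines.append(f"{src_lang}: {source}")
        lines.append(f"{tgt_lang}: {target}")
    lines.append(f"{src_lang}: {query}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)
```

Under these assumptions, `build_prompt(perturb(demos, "target"), query)` and `build_prompt(perturb(demos, "source"), query)` produce two prompts that differ only in which side of the demonstrations was disturbed, so their translation quality can be compared against the unperturbed prompt.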