
In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation

August 1, 2024
Authors: Armel Zebaze, Benoît Sagot, Rachel Bawden
cs.AI

Abstract

The ability of generative large language models (LLMs) to perform in-context learning has given rise to a large body of research into how best to prompt models for various natural language processing tasks. In this paper, we focus on machine translation (MT), a task that has been shown to benefit from in-context translation examples. However, no systematic studies have been published on how best to select examples, and mixed results have been reported on the usefulness of similarity-based selection over random selection. We provide a study covering multiple LLMs and multiple in-context example retrieval strategies, comparing multilingual sentence embeddings. We cover several language directions, representing different levels of language resourcedness (English into French, German, Swahili and Wolof). Contrary to previously published results, we find that sentence embedding similarity can improve MT, especially for low-resource language directions, and discuss the balance between selection pool diversity and quality. We also highlight potential problems with the evaluation of LLM-based MT and suggest a more appropriate evaluation protocol, adapting the COMET metric to the evaluation of LLMs. Code and outputs are freely available at https://github.com/ArmelRandy/ICL-MT.
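The core idea, selecting in-context examples by sentence-embedding similarity, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes source-side embeddings have already been computed with some multilingual sentence encoder, and the function names, the English-to-French prompt template, and the `k=3` default are illustrative choices.

```python
import numpy as np

def select_examples(query_emb: np.ndarray, pool_embs: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k pool sentences most similar to the query.

    query_emb: embedding of the source sentence to translate, shape (d,)
    pool_embs: embeddings of the selection pool's source sentences, shape (n, d)
    """
    # Cosine similarity = dot product of L2-normalised vectors.
    q = query_emb / np.linalg.norm(query_emb)
    p = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    sims = p @ q
    # Indices sorted by decreasing similarity; keep the top k.
    return np.argsort(-sims)[:k]

def build_prompt(source: str, pool_pairs: list[tuple[str, str]], idxs) -> str:
    """Format retrieved (source, target) pairs as few-shot demonstrations."""
    shots = "".join(
        f"English: {src}\nFrench: {tgt}\n\n" for src, tgt in (pool_pairs[i] for i in idxs)
    )
    return shots + f"English: {source}\nFrench:"
```

A random-selection baseline is obtained by replacing `select_examples` with a draw of `k` random pool indices; the paper's comparison is between these two strategies across several embedding models and LLMs.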
