In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation
August 1, 2024
Authors: Armel Zebaze, Benoît Sagot, Rachel Bawden
cs.AI
Abstract
The ability of generative large language models (LLMs) to perform in-context
learning has given rise to a large body of research into how best to prompt
models for various natural language processing tasks. In this paper, we focus
on machine translation (MT), a task that has been shown to benefit from
in-context translation examples. However, no systematic studies have been
published on how best to select examples, and mixed results have been reported
on the usefulness of similarity-based selection over random selection. We
provide a study covering multiple LLMs and multiple in-context example
retrieval strategies, comparing multilingual sentence embeddings. We cover
several language directions, representing different levels of language
resourcedness (English into French, German, Swahili and Wolof). Contrary to
previously published results, we find that sentence embedding similarity can
improve MT, especially for low-resource language directions, and discuss the
balance between selection pool diversity and quality. We also highlight
potential problems with the evaluation of LLM-based MT and suggest a more
appropriate evaluation protocol, adapting the COMET metric to the evaluation of
LLMs. Code and outputs are freely available at
https://github.com/ArmelRandy/ICL-MT.
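
As a concrete illustration of the similarity-based selection the abstract describes, the sketch below embeds the source side of a pool of translation pairs, retrieves the nearest neighbours of the input sentence, and formats them as few-shot demonstrations. This is a minimal sketch, not the authors' implementation (see the repository above): the sentence-transformers library, the LaBSE encoder, the function names (`select_examples`, `build_prompt`), the prompt template and k=4 are all illustrative assumptions; the abstract compares several multilingual sentence embeddings without committing to a single one.

```python
# Minimal sketch of similarity-based in-context example selection for MT.
# Assumptions (not from the paper): the sentence-transformers library,
# the LaBSE multilingual encoder, and this particular prompt template.
from sentence_transformers import SentenceTransformer
import numpy as np

encoder = SentenceTransformer("sentence-transformers/LaBSE")

def select_examples(source, pool, k=4):
    """Return the k pairs from `pool` whose source side is most similar
    to `source`, where `pool` is a list of (src, tgt) translation pairs."""
    pool_emb = encoder.encode([src for src, _ in pool], normalize_embeddings=True)
    query_emb = encoder.encode([source], normalize_embeddings=True)[0]
    scores = pool_emb @ query_emb  # cosine similarity: embeddings are unit-norm
    return [pool[i] for i in np.argsort(-scores)[:k]]

def build_prompt(source, examples, src_lang="English", tgt_lang="Swahili"):
    """Format the retrieved pairs as few-shot demonstrations for the LLM."""
    blocks = [f"{src_lang}: {src}\n{tgt_lang}: {tgt}" for src, tgt in examples]
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")  # sentence to translate
    return "\n\n".join(blocks)
```

The random-selection baseline against which the mixed results in the literature were reported simply replaces `select_examples` with a uniform sample from the pool; the ordering of demonstrations within the prompt is a further design choice not fixed by the abstract.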
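On the evaluation side, the abstract proposes adapting COMET to LLM outputs but does not spell out the adaptation, so the snippet below only shows stock reference-based COMET scoring with the unbabel-comet package, i.e., the starting point such a protocol would modify. Model choice, batch size and the example sentences are illustrative.

```python
# Stock COMET scoring with the unbabel-comet package (pip install unbabel-comet).
# The paper's LLM-specific adaptation is not described in the abstract and is
# therefore not reproduced here.
from comet import download_model, load_from_checkpoint

ckpt_path = download_model("Unbabel/wmt22-comet-da")
comet_model = load_from_checkpoint(ckpt_path)

samples = [
    {"src": "The cat sat on the mat.",     # source sentence
     "mt": "Paka alikaa kwenye mkeka.",    # LLM translation being scored
     "ref": "Paka aliketi juu ya mkeka."}  # human reference
]
output = comet_model.predict(samples, batch_size=8, gpus=0)
print(output.system_score)  # corpus-level score, roughly in [0, 1]
```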