In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation

Abstract

The ability of generative large language models (LLMs) to perform in-context learning has given rise to a large body of research into how best to prompt models for various natural language processing tasks. In this paper, we focus on machine translation (MT), a task that has been shown to benefit from in-context translation examples. However, no systematic studies have been published on how best to select examples, and mixed results have been reported on the usefulness of similarity-based selection over random selection. We provide a study covering multiple LLMs and multiple in-context example retrieval strategies, comparing multilingual sentence embeddings. We cover several language directions, representing different levels of language resourcedness (English into French, German, Swahili and Wolof). Contrary to previously published results, we find that sentence embedding similarity can improve MT, especially for low-resource language directions, and we discuss the balance between selection pool diversity and quality. We also highlight potential problems with the evaluation of LLM-based MT and suggest a more appropriate evaluation protocol, adapting the COMET metric to the evaluation of LLMs. Code and outputs are freely available at https://github.com/ArmelRandy/ICL-MT.
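As a rough illustration of similarity-based in-context example selection, the following minimal sketch ranks a small pool of translation pairs by the cosine similarity of their source side to the sentence being translated, then assembles a few-shot prompt. It is not the code from the repository above. The function names (`toy_embed`, `select_examples`, `build_prompt`), the prompt template, and the example pool are all hypothetical, and the character-trigram vectors are only a stand-in for the multilingual sentence embeddings the study compares.

```python
import math
from collections import Counter


def toy_embed(sentence):
    """Hypothetical stand-in for a multilingual sentence encoder:
    a bag of character trigrams (assumption, for illustration only)."""
    text = f"  {sentence.lower()}  "
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_examples(source, pool, k=2):
    """Return the k (source, target) pairs whose source side is most
    similar to `source`, most similar first."""
    query = toy_embed(source)
    ranked = sorted(
        pool,
        key=lambda pair: cosine(query, toy_embed(pair[0])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(source, examples, src_lang="English", tgt_lang="French"):
    """Format the selected pairs and the new sentence as a few-shot prompt.
    The template is an assumption, not the one used in the paper."""
    blocks = [f"{src_lang}: {src}\n{tgt_lang}: {tgt}" for src, tgt in examples]
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n\n".join(blocks)


# Hypothetical selection pool of (source, target) pairs.
pool = [
    ("The cat sleeps on the sofa.", "Le chat dort sur le canapé."),
    ("Interest rates rose again this year.",
     "Les taux d'intérêt ont encore augmenté cette année."),
    ("The dog sleeps on the bed.", "Le chien dort sur le lit."),
]

source = "The cat sleeps on the bed."
examples = select_examples(source, pool, k=2)
prompt = build_prompt(source, examples)
print(prompt)
```

In practice the toy embedding would be replaced by a real multilingual sentence encoder and the linear scan by a nearest-neighbour index; random selection, the baseline mentioned above, amounts to sampling k pairs from the pool instead of ranking them.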

