Lost in Literalism: How Supervised Training Shapes Translationese in LLMs

Abstract

Large language models (LLMs) have achieved remarkable success in machine translation, demonstrating impressive performance across diverse languages. However, translationese, characterized by overly literal and unnatural translations, remains a persistent challenge in LLM-based translation systems. Despite their pre-training on vast corpora of natural utterances, LLMs exhibit translationese errors and generate unexpectedly unnatural translations, which stem from biases introduced during supervised fine-tuning (SFT). In this work, we systematically evaluate the prevalence of translationese in LLM-generated translations and investigate its roots in supervised training. We introduce methods to mitigate these biases, including polishing golden references and filtering unnatural training instances. Empirical evaluations demonstrate that these approaches significantly reduce translationese while improving translation naturalness, as validated by human evaluations and automatic metrics. Our findings highlight the need for training-aware adjustments to optimize LLM translation outputs, paving the way for more fluent and target-language-consistent translations. We release the data and code at https://github.com/yafuly/LLM_Translationese.
