Lost in Literalism: How Supervised Training Shapes Translationese in LLMs
March 6, 2025
Authors: Yafu Li, Ronghao Zhang, Zhilin Wang, Huajian Zhang, Leyang Cui, Yongjing Yin, Tong Xiao, Yue Zhang
cs.AI
Abstract
Large language models (LLMs) have achieved remarkable success in machine
translation, demonstrating impressive performance across diverse languages.
However, translationese, characterized by overly literal and unnatural
translations, remains a persistent challenge in LLM-based translation systems.
Despite their pre-training on vast corpora of natural utterances, LLMs exhibit
translationese errors and generate unexpectedly unnatural translations,
stemming from biases introduced during supervised fine-tuning (SFT). In this work, we
systematically evaluate the prevalence of translationese in LLM-generated
translations and investigate its roots during supervised training. We introduce
methods to mitigate these biases, including polishing golden references and
filtering unnatural training instances. Empirical evaluations demonstrate that
these approaches significantly reduce translationese while improving
translation naturalness, validated by human evaluations and automatic metrics.
Our findings highlight the need for training-aware adjustments to optimize LLM
translation outputs, paving the way for more fluent and
target-language-consistent translations. We release the data and code at
https://github.com/yafuly/LLM_Translationese.
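The abstract names two mitigation methods, polishing golden references and filtering unnatural training instances, only at a high level; the authors' released code at the link above is the authoritative implementation. As a rough illustration of the filtering idea, the sketch below scores each reference translation's fluency with a monolingual language model and drops high-perplexity pairs. The model choice (gpt2) and the perplexity cutoff are illustrative assumptions, not the paper's settings.

```python
# A minimal sketch (not the authors' released code) of filtering
# unnatural training instances: score each reference translation's
# naturalness with a monolingual LM and drop high-perplexity outliers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"      # hypothetical choice of target-language LM
PPL_THRESHOLD = 80.0     # hypothetical naturalness cutoff

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` under the monolingual LM (lower = more natural)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels makes the model return mean token cross-entropy.
        loss = model(ids, labels=ids).loss
    return float(torch.exp(loss))

def filter_natural(pairs):
    """Keep (source, reference) pairs whose reference reads naturally."""
    return [(src, ref) for src, ref in pairs if perplexity(ref) < PPL_THRESHOLD]

# Toy example: an idiomatic reference vs. an overly literal, translationese one.
train_pairs = [
    ("Er hat den Nagel auf den Kopf getroffen.",
     "He hit the nail on the head."),
    ("Er hat den Nagel auf den Kopf getroffen.",
     "He has the nail on the head hit."),
]
print(filter_natural(train_pairs))
```

Target-side perplexity is only one common proxy for naturalness; the paper's own filtering criterion may differ, and in practice the threshold would need tuning per language and domain.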