Language Surgery in Multilingual Large Language Models
June 14, 2025
Authors: Joanito Agili Lopo, Muhammad Ravi Shulthan Habibi, Tack Hwa Wong, Muhammad Ilham Ghozali, Fajri Koto, Genta Indra Winata, Peerat Limkonchotiwat, Alham Fikri Aji, Samuel Cahyawijaya
cs.AI
Abstract
Large Language Models (LLMs) have demonstrated remarkable generalization
capabilities across tasks and languages, revolutionizing natural language
processing. This paper investigates the naturally emerging representation
alignment in LLMs, particularly in the middle layers, and its implications for
disentangling language-specific and language-agnostic information. We
empirically confirm the existence of this alignment, analyze its behavior in
comparison to explicitly designed alignment models, and demonstrate its
potential for language-specific manipulation without semantic degradation.
Building on these findings, we propose Inference-Time Language Control (ITLC),
a novel method that leverages latent injection to enable precise cross-lingual
language control and mitigate language confusion in LLMs. Our experiments
highlight ITLC's strong cross-lingual control capabilities while preserving
semantic integrity in target languages. Furthermore, we demonstrate its
effectiveness in alleviating the cross-lingual language confusion problem,
which persists even in current large-scale LLMs, leading to inconsistent
language generation. This work advances our understanding of representation
alignment in LLMs and introduces a practical solution for enhancing their
cross-lingual performance.
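The abstract describes ITLC as using "latent injection" into the naturally aligned middle-layer representations to steer the output language. The paper's exact procedure is not given here, so the following is only a minimal toy sketch of one common reading of that idea: estimate a language direction as the difference between mean middle-layer activations of two languages, then shift a hidden state along that direction at inference time. All names, the toy data, and the centroid-difference estimator are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Toy sketch of latent injection for language control (illustrative only).
rng = np.random.default_rng(0)
d = 8  # hidden size (toy)

# Stand-ins for middle-layer activations gathered from parallel text in two
# languages; in practice these would come from running the LLM itself.
h_src = rng.normal(loc=0.0, size=(100, d))  # source-language states
h_tgt = rng.normal(loc=0.5, size=(100, d))  # target-language states

# Language direction: difference of the two mean activations.
v = h_tgt.mean(axis=0) - h_src.mean(axis=0)

def inject(h, direction, alpha=1.0):
    """Shift hidden states toward the target-language region of latent space."""
    return h + alpha * direction

shifted = inject(h_src, v)

# The shifted states end up closer to the target-language centroid, which is
# the intended effect: language moves while the per-example content (the
# residual around the centroid) is left untouched.
mu_tgt = h_tgt.mean(axis=0)
d_before = np.linalg.norm(h_src - mu_tgt, axis=1).mean()
d_after = np.linalg.norm(shifted - mu_tgt, axis=1).mean()
print(d_after < d_before)  # → True
```

Because the shift is a constant offset, it changes only the language-specific component captured by the centroid difference; this is one simple way the disentanglement of language-specific and language-agnostic information described in the abstract could be exploited.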