Language Surgery in Multilingual Large Language Models

Abstract

Large Language Models (LLMs) have demonstrated remarkable generalization capabilities across tasks and languages, revolutionizing natural language processing. This paper investigates the representation alignment that emerges naturally in LLMs, particularly in the middle layers, and its implications for disentangling language-specific and language-agnostic information. We empirically confirm the existence of this alignment, compare its behavior with that of explicitly designed alignment models, and demonstrate its potential for language-specific manipulation without semantic degradation. Building on these findings, we propose Inference-Time Language Control (ITLC), a novel method that leverages latent injection to enable precise cross-lingual language control and mitigate language confusion in LLMs. Our experiments highlight ITLC's strong cross-lingual control capabilities while preserving semantic integrity in target languages. Furthermore, we demonstrate its effectiveness in alleviating the cross-lingual language confusion problem, which persists even in current large-scale LLMs and leads to inconsistent language generation. This work advances our understanding of representation alignment in LLMs and introduces a practical solution for enhancing their cross-lingual performance.

