Language Surgery in Multilingual Large Language Models

Abstract

Large Language Models (LLMs) have demonstrated remarkable generalization capabilities across tasks and languages, revolutionizing natural language processing. This paper investigates the representation alignment that emerges naturally in LLMs, particularly in the middle layers, and its implications for disentangling language-specific and language-agnostic information. We empirically confirm the existence of this alignment, analyze its behavior in comparison to explicitly designed alignment models, and demonstrate its potential for language-specific manipulation without semantic degradation. Building on these findings, we propose Inference-Time Language Control (ITLC), a novel method that leverages latent injection to enable precise cross-lingual language control and mitigate language confusion in LLMs. Our experiments highlight ITLC's strong cross-lingual control capabilities while preserving semantic integrity in target languages. Furthermore, we demonstrate its effectiveness in alleviating the cross-lingual language confusion problem, which persists even in current large-scale LLMs and leads to inconsistent language generation. This work advances our understanding of representation alignment in LLMs and introduces a practical solution for enhancing their cross-lingual performance.
