How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective

Abstract

Multilingual alignment is an effective and representative paradigm for enhancing the multilingual capabilities of LLMs, transferring capabilities from high-resource languages to low-resource languages. Meanwhile, studies of language-specific neurons have shown that certain neurons in LLMs are selectively activated when processing different languages. This offers a new perspective for analyzing and understanding LLM mechanisms in multilingual scenarios more specifically. In this work, we propose a new finer-grained neuron identification algorithm that detects language neurons (including language-specific neurons and language-related neurons) and language-agnostic neurons. Furthermore, based on the distributional characteristics of the different neuron types, we divide the internal process of LLMs for multilingual inference into four parts: (1) multilingual understanding, (2) shared semantic space reasoning, (3) multilingual output space transformation, and (4) vocabulary space outputting. Additionally, we systematically analyze models before and after alignment, focusing on the different types of neurons. We also analyze the phenomenon of "Spontaneous Multilingual Alignment". Overall, our work conducts a comprehensive investigation based on different types of neurons, providing empirical results and valuable insights for better understanding multilingual alignment and the multilingual capabilities of LLMs.
