How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective

Abstract

Multilingual alignment is an effective and representative paradigm for enhancing the multilingual capabilities of large language models (LLMs): it transfers capabilities from high-resource languages to low-resource languages. Meanwhile, research on language-specific neurons has shown that LLMs contain neurons that are selectively activated when processing different languages. This offers a new perspective for analyzing and understanding LLM mechanisms more concretely in multilingual scenarios. In this work, we propose a new finer-grained neuron identification algorithm that detects language neurons (including language-specific neurons and language-related neurons) and language-agnostic neurons. Furthermore, based on the distributional characteristics of the different neuron types, we divide the LLMs' internal process for multilingual inference into four parts: (1) multilingual understanding, (2) shared semantic space reasoning, (3) multilingual output space transformation, and (4) vocabulary space outputting. Additionally, we systematically analyze models before and after alignment, focusing on the different types of neurons. We also analyze the phenomenon of "Spontaneous Multilingual Alignment". Overall, our work conducts a comprehensive investigation based on different types of neurons, providing empirical results and valuable insights for better understanding multilingual alignment and the multilingual capabilities of LLMs.
