How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective

Abstract

Multilingual alignment is an effective and representative paradigm for enhancing the multilingual capabilities of large language models (LLMs) by transferring capabilities from high-resource languages to low-resource languages. Meanwhile, research on language-specific neurons shows that LLMs selectively activate such neurons when processing different languages. This offers a new perspective for analyzing and understanding the mechanisms of LLMs in multilingual scenarios at a finer granularity. In this work, we propose a new, finer-grained neuron identification algorithm that detects language neurons (including language-specific neurons and language-related neurons) and language-agnostic neurons. Furthermore, based on the distributional characteristics of the different neuron types, we divide the internal process of LLMs for multilingual inference into four parts: (1) multilingual understanding, (2) shared semantic space reasoning, (3) multilingual output space transformation, and (4) vocabulary space outputting. Additionally, we systematically analyze models before and after alignment, focusing on the different types of neurons. We also analyze the phenomenon of "Spontaneous Multilingual Alignment". Overall, our work presents a comprehensive investigation based on different types of neurons, providing empirical results and valuable insights for better understanding the multilingual alignment and multilingual capabilities of LLMs.
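The abstract names three neuron categories but does not state the rule used to tell them apart. The following is only an illustrative sketch under an assumed rule, not the authors' algorithm: a neuron is labeled by the number of languages in which its activation probability reaches a threshold. The function name `classify_neurons`, the threshold of 0.5, the extra "inactive" label, and the toy activation values are all hypothetical.

```python
from typing import Dict, List


def classify_neurons(
    act_prob: Dict[str, List[float]], threshold: float = 0.5
) -> List[str]:
    """Label each neuron by how many languages activate it.

    act_prob maps a language code to per-neuron activation probabilities.
    ASSUMPTION: the counting rule and the threshold are illustrative only;
    the abstract does not describe the actual identification criterion.
    """
    languages = list(act_prob)
    n_neurons = len(act_prob[languages[0]])
    labels = []
    for i in range(n_neurons):
        # Languages in which neuron i is considered active.
        active = [lang for lang in languages if act_prob[lang][i] >= threshold]
        if len(active) == len(languages):
            labels.append("language-agnostic")   # active in every language
        elif len(active) == 1:
            labels.append("language-specific")   # active in exactly one language
        elif len(active) > 1:
            labels.append("language-related")    # active in several, not all
        else:
            labels.append("inactive")            # hypothetical catch-all label
    return labels


# Toy activation probabilities for four neurons in three languages (made up).
toy = {
    "en": [0.9, 0.8, 0.9, 0.1],
    "zh": [0.1, 0.7, 0.8, 0.0],
    "sw": [0.2, 0.1, 0.9, 0.2],
}
labels = classify_neurons(toy)
print(labels)
```

In this toy setup, language-specific and language-related neurons together play the role of the "language neurons" mentioned in the abstract, while neurons active in every language correspond to the language-agnostic group.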
