How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective
May 27, 2025
Authors: Shimao Zhang, Zhejian Lai, Xiang Liu, Shuaijie She, Xiao Liu, Yeyun Gong, Shujian Huang, Jiajun Chen
cs.AI
Abstract
Multilingual alignment is an effective and representative paradigm for enhancing
LLMs' multilingual capabilities by transferring capabilities from high-resource
languages to low-resource languages. Meanwhile, research on language-specific
neurons reveals that LLMs contain neurons that are selectively activated when
processing different languages. This provides a new perspective for analyzing
and understanding LLMs' mechanisms more specifically in multilingual scenarios.
In this work, we
propose a new finer-grained neuron identification algorithm, which detects
language neurons (including language-specific neurons and language-related
neurons) and language-agnostic neurons. Furthermore, based on the
distributional characteristics of different types of neurons, we divide the
LLMs' internal process for multilingual inference into four parts: (1)
multilingual understanding, (2) shared semantic space reasoning, (3)
multilingual output space transformation, and (4) vocabulary space outputting.
Additionally, we systematically analyze the models before and after alignment
with a focus on different types of neurons. We also analyze the phenomenon of
"Spontaneous Multilingual Alignment". Overall, our work conducts a
comprehensive investigation based on different types of neurons, providing
empirical results and valuable insights for better understanding multilingual
alignment and multilingual capabilities of LLMs.
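
The abstract names three neuron categories but does not spell out the identification rule. As a purely illustrative sketch, suppose each neuron has already been marked active or inactive per language, and that a neuron active in exactly one language counts as language-specific, one active in several but not all languages counts as language-related, and one active in every language counts as language-agnostic. This counting rule, the boolean activity table, the function name `classify_neurons`, and the extra "inactive" label are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List


def classify_neurons(active: Dict[str, List[bool]]) -> List[str]:
    """Label each neuron by the number of languages it is active in.

    `active` maps a language code to one boolean per neuron. The thresholds
    below are an assumed rule for illustration, not the paper's algorithm.
    """
    languages = list(active)
    num_neurons = len(active[languages[0]])
    labels = []
    for i in range(num_neurons):
        # Count the languages in which neuron i is marked active.
        count = sum(active[lang][i] for lang in languages)
        if count == 0:
            labels.append("inactive")
        elif count == 1:
            labels.append("language-specific")
        elif count < len(languages):
            labels.append("language-related")
        else:
            labels.append("language-agnostic")
    return labels


# Hypothetical activity table for four neurons across three languages.
active = {
    "en": [True, True, True, False],
    "zh": [False, True, True, False],
    "sw": [False, False, True, False],
}
labels = classify_neurons(active)
print(labels)
```

In this toy table the first neuron fires only for English, the second for English and Chinese, the third for all three languages, and the fourth for none, so each of the four labels appears once.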