The Same But Different: Structural Similarities and Differences in Multilingual Language Modeling

Abstract

We use new tools from mechanistic interpretability to ask whether the internal structure of large language models (LLMs) corresponds to the linguistic structures underlying the languages on which they are trained. In particular, we ask: (1) when two languages employ the same morphosyntactic processes, do LLMs handle them using shared internal circuitry? and (2) when two languages require different morphosyntactic processes, do LLMs handle them using different internal circuitry? Using English and Chinese multilingual and monolingual models, we analyze the internal circuitry involved in two tasks. We find evidence that models employ the same circuit to handle the same syntactic process regardless of the language in which it occurs, and that this holds even for monolingual models trained completely independently. Moreover, we show that multilingual models employ language-specific components (attention heads and feed-forward networks) when needed to handle linguistic processes (e.g., morphological marking) that exist only in some languages. Together, our results provide new insights into how LLMs trade off between exploiting common structures and preserving linguistic differences when tasked with modeling multiple languages simultaneously.
