PLDR-LLMs Reason At Self-Organized Criticality

Abstract

We show that PLDR-LLMs pretrained at self-organized criticality exhibit reasoning at inference time. The characteristics of PLDR-LLM deductive outputs at criticality are similar to those of second-order phase transitions. At criticality, the correlation length diverges, and the deductive outputs attain a metastable steady state. The steady-state behaviour suggests that deductive outputs learn representations equivalent to scaling functions, universality classes and renormalization groups from the training dataset, and that generalization and reasoning capabilities arise in the process. We can then define an order parameter from the global statistics of the model's deductive output parameters at inference. The reasoning capabilities of a PLDR-LLM are better when its order parameter is close to zero at criticality. This observation is supported by the benchmark scores of models trained at near-criticality and sub-criticality. Our results provide a self-contained explanation of how reasoning manifests in large language models. The ability to reason can be quantified solely from the global model parameter values of the deductive outputs at steady state, without any need to evaluate curated benchmark datasets for reasoning and comprehension through the inductive output.
