PLDR-LLMs Reason At Self-Organized Criticality

Abstract

We show that PLDR-LLMs pretrained at self-organized criticality exhibit reasoning at inference time. The characteristics of PLDR-LLM deductive outputs at criticality are similar to those of second-order phase transitions. At criticality, the correlation length diverges, and the deductive outputs attain a metastable steady state. The steady-state behaviour suggests that the deductive outputs learn representations equivalent to scaling functions, universality classes and renormalization groups from the training dataset, leading to generalization and reasoning capabilities in the process. We can then define an order parameter from the global statistics of the model's deductive output parameters at inference. The reasoning capabilities of a PLDR-LLM are better when its order parameter is close to zero at criticality. This observation is supported by the benchmark scores of models trained at near-criticality and sub-criticality. Our results provide a self-contained explanation of how reasoning manifests in large language models, and show that the ability to reason can be quantified solely from global model parameter values of the deductive outputs at steady state, without any need to evaluate curated benchmark datasets through inductive output for reasoning and comprehension.
