PLDR-LLMs Reason At Self-Organized Criticality
March 12, 2026
Author: Burc Gokden
cs.AI
Abstract
We show that PLDR-LLMs pretrained at self-organized criticality exhibit reasoning at inference time. The characteristics of PLDR-LLM deductive outputs at criticality are similar to second-order phase transitions. At criticality, the correlation length diverges, and the deductive outputs attain a metastable steady state. This steady-state behaviour suggests that deductive outputs learn representations equivalent to scaling functions, universality classes, and renormalization groups from the training dataset, leading to generalization and reasoning capabilities in the process. We can then define an order parameter from the global statistics of the model's deductive output parameters at inference. The reasoning capabilities of a PLDR-LLM are better when its order parameter is close to zero at criticality. This observation is supported by the benchmark scores of models trained at near-criticality and sub-criticality. Our results provide a self-contained explanation of how reasoning manifests in large language models, and show that the ability to reason can be quantified solely from global model parameter values of the deductive outputs at steady state, without any need to evaluate curated benchmark datasets through inductive outputs for reasoning and comprehension.
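To make the order-parameter idea concrete, here is a minimal sketch, assuming the deductive outputs can be treated as a numeric tensor `G` and that the order parameter is a simple global statistic (the mean) over its entries. The tensor shapes, the Gaussian toy data, and the choice of statistic are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

def order_parameter(G: np.ndarray) -> float:
    """Global statistic over deductive-output entries.

    Assumption: the order parameter is taken as the global mean of the
    tensor entries; the paper only states it is defined from global
    statistics of the deductive output parameters.
    """
    return float(np.mean(G))

rng = np.random.default_rng(0)

# Toy stand-ins for deductive-output tensors (heads x dim x dim):
# a near-critical model is assumed to yield entries centered near zero,
# a sub-critical one to yield entries with a nonzero bias.
G_near_critical = rng.normal(loc=0.0, scale=1.0, size=(8, 64, 64))
G_sub_critical = rng.normal(loc=0.5, scale=1.0, size=(8, 64, 64))

m_near = abs(order_parameter(G_near_critical))
m_sub = abs(order_parameter(G_sub_critical))

# The abstract's claim, in this toy setting: reasoning is better when
# |order parameter| is closer to zero at criticality.
print(f"|m| near-critical: {m_near:.4f}, sub-critical: {m_sub:.4f}")
```

Under these assumptions, ranking models by how close `|order_parameter(G)|` is to zero would give the kind of benchmark-free quantification of reasoning ability the abstract describes.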