Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain
October 15, 2025
Authors: Jingmin An, Yilong Song, Ruolin Yang, Nai Ding, Lingxi Lu, Yuxuan Wang, Wei Wang, Chu Zhuang, Qian Wang, Fang Fang
cs.AI
Abstract
Large Language Models (LLMs) demonstrate human-level or even superior
language abilities, effectively modeling syntactic structures, yet the specific
computational modules responsible remain unclear. A key question is whether LLM
behavioral capabilities stem from mechanisms akin to those in the human brain.
To address these questions, we introduce the Hierarchical Frequency Tagging
Probe (HFTP), a tool that utilizes frequency-domain analysis to identify
neuron-wise components of LLMs (e.g., individual Multilayer Perceptron (MLP)
neurons) and cortical regions (via intracranial recordings) encoding syntactic
structures. Our results show that models such as GPT-2, Gemma, Gemma 2, Llama
2, Llama 3.1, and GLM-4 process syntax in analogous layers, while the human
brain relies on distinct cortical regions for different syntactic levels.
Representational similarity analysis reveals a stronger alignment between LLM
representations and the left hemisphere of the brain (dominant in language
processing). Notably, upgraded models exhibit divergent trends: Gemma 2 shows
greater brain similarity than Gemma, while Llama 3.1 shows less alignment with
the brain compared to Llama 2. These findings offer new insights into the
interpretability of LLM behavioral improvements, raising questions about
whether these advancements are driven by human-like or non-human-like
mechanisms, and establish HFTP as a valuable tool bridging computational
linguistics and cognitive neuroscience. This project is available at
https://github.com/LilTiger/HFTP.
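
The abstract does not spell out how the frequency-domain analysis is carried out, so the following is only a minimal sketch of the general frequency-tagging idea, not the authors' implementation. It assumes (hypothetically) that stimuli are sentences of a fixed length of four tokens and that one activation value per token is available for a unit, so a unit tracking sentence structure would show a spectral peak at 0.25 cycles per token. The function names `tagging_spectrum` and `peak_ratio`, the synthetic data, and the neighbor-bin normalization are all illustrative assumptions.

```python
import numpy as np


def tagging_spectrum(activations):
    """Return (freqs, amplitude) for one unit's per-token activation sequence.

    Frequencies are in cycles per token, because the sample spacing is one token.
    """
    x = np.asarray(activations, dtype=float)
    x = x - x.mean()                        # remove the DC offset
    amp = np.abs(np.fft.rfft(x)) / len(x)   # one-sided amplitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0)
    return freqs, amp


def peak_ratio(freqs, amp, target, n_neighbors=2):
    """Amplitude at the bin nearest `target`, divided by the mean of nearby bins."""
    idx = int(np.argmin(np.abs(freqs - target)))
    lo = max(idx - n_neighbors, 1)          # skip the DC bin
    hi = min(idx + n_neighbors, len(amp) - 1)
    neighbors = [amp[i] for i in range(lo, hi + 1) if i != idx]
    return amp[idx] / (np.mean(neighbors) + 1e-12)


# Synthetic example (assumption: 50 sentences of 4 tokens each, 200 tokens total).
rng = np.random.default_rng(0)
n_tokens = 200
t = np.arange(n_tokens)

# A hypothetical unit whose activation repeats once per 4-token sentence, plus noise.
sentence_unit = np.cos(2 * np.pi * t / 4) + 0.5 * rng.standard_normal(n_tokens)

freqs, amp = tagging_spectrum(sentence_unit)
sentence_ratio = peak_ratio(freqs, amp, target=0.25)
peak_freq = freqs[int(np.argmax(amp))]
print(f"peak at {peak_freq:.3f} cycles/token, peak-to-neighbor ratio {sentence_ratio:.1f}")
```

The same spectrum could in principle be computed for an intracranial recording channel, with frequencies expressed in Hz instead of cycles per token; which statistical test decides that a peak is significant is left open here because the abstract does not specify one.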