Scaling laws for language encoding models in fMRI
May 19, 2023
Authors: Richard Antonello, Aditya Vaidya, Alexander G. Huth
cs.AI
Abstract
Representations from transformer-based unidirectional language models are
known to be effective at predicting brain responses to natural language.
However, most studies comparing language models to brains have used GPT-2 or
similarly sized language models. Here we tested whether larger open-source
models such as those from the OPT and LLaMA families are better at predicting
brain responses recorded using fMRI. Mirroring scaling results from other
contexts, we found that brain prediction performance scales log-linearly with
model size from 125M to 30B parameter models, with ~15% increased encoding
performance as measured by correlation with a held-out test set across 3
subjects. Similar log-linear behavior was observed when scaling the size of the
fMRI training set. We also characterized scaling for acoustic encoding models
that use HuBERT, WavLM, and Whisper, and we found comparable improvements with
model size. A noise ceiling analysis of these large, high-performance encoding
models showed that performance is nearing the theoretical maximum for brain
areas such as the precuneus and higher auditory cortex. These results suggest
that increasing scale in both models and data will yield incredibly effective
models of language processing in the brain, enabling better scientific
understanding as well as applications such as decoding.
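The "encoding model" approach the abstract refers to is, at its core, a regularized linear map from language-model features to voxel responses, evaluated by correlation on held-out data, with performance then fit log-linearly against model size. The sketch below illustrates that pipeline on synthetic data only; it is not the authors' code, and the model sizes, correlations, and ridge penalty are all illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha*I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def encoding_performance(n_features, n_train=500, n_test=200, n_voxels=50):
    """Fit a linear encoding model on synthetic data and return the
    mean voxelwise correlation on a held-out test set."""
    # Synthetic ground-truth mapping from features to voxel responses.
    W_true = rng.normal(size=(n_features, n_voxels))
    X_train = rng.normal(size=(n_train, n_features))
    X_test = rng.normal(size=(n_test, n_features))
    noise = 5.0
    Y_train = X_train @ W_true + noise * rng.normal(size=(n_train, n_voxels))
    Y_test = X_test @ W_true + noise * rng.normal(size=(n_test, n_voxels))
    W = fit_ridge(X_train, Y_train, alpha=10.0)
    pred = X_test @ W
    # Correlate predicted and observed responses per voxel, then average.
    r = [np.corrcoef(pred[:, v], Y_test[:, v])[0, 1] for v in range(n_voxels)]
    return float(np.mean(r))

# Log-linear scaling fit: performance ~= a * log10(params) + b.
# Parameter counts stand in for a model family (e.g. 125M to 30B);
# the correlation values are made up for illustration.
sizes = np.array([125e6, 1.3e9, 6.7e9, 30e9])
perf = np.array([0.20, 0.23, 0.25, 0.27])
a, b = np.polyfit(np.log10(sizes), perf, 1)
print(f"slope per decade of parameters: {a:.3f}")
```

A positive fitted slope `a` is the synthetic analogue of the paper's finding that encoding performance improves log-linearly with parameter count.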