Beyond Surface: Probing LLaMA Across Scales and Layers
December 7, 2023
Authors: Nuo Chen, Ning Wu, Shining Liang, Ming Gong, Linjun Shou, Dongmei Zhang, Jia Li
cs.AI
Abstract
This paper presents an in-depth analysis of Large Language Models (LLMs),
focusing on LLaMA, a prominent open-source foundational model in natural
language processing. Instead of assessing LLaMA through its generative output,
we design multiple-choice tasks to probe its intrinsic understanding in
higher-order tasks such as reasoning and computation. We examine the model
horizontally, comparing different sizes, and vertically, assessing different
layers. We unveil several key and unexpected findings based on the designed
probing tasks: (1) Horizontally, enlarging the model size rarely imparts
additional knowledge or computational prowess on its own. It can, however,
enhance reasoning ability, especially in mathematical problem solving, and
help reduce hallucinations, but only beyond certain size thresholds; (2)
Vertically, the lower layers of LLaMA lack substantial arithmetic and factual
knowledge but exhibit logical thinking, multilingual, and recognition
abilities, while the top layers house most of the computational power and
real-world knowledge.
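The abstract does not specify how the probes are implemented, but a standard way to realize layer-wise probing is to extract hidden states from every layer and fit an independent lightweight classifier per layer, then compare accuracies across depth (and across checkpoint sizes for the horizontal comparison). The sketch below is a minimal illustration of that idea, not the paper's code; the checkpoint name, the toy `texts`/`labels` data, and the last-token pooling choice are all assumptions.

```python
# Minimal layer-wise probing sketch (assumptions throughout: checkpoint name,
# toy data, and last-token pooling are illustrative, not the paper's setup).
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumption: any LLaMA-family checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
if tokenizer.pad_token is None:          # LLaMA tokenizers ship without a pad token
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"         # so the last-token index below is valid

model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

# Toy stand-in for the paper's multiple-choice probing tasks: statements
# labeled 1 (true) or 0 (false), covering arithmetic and factual knowledge.
texts = [
    "7 * 8 = 56",
    "7 * 8 = 54",
    "Paris is the capital of France.",
    "Paris is the capital of Italy.",
]
labels = [1, 0, 1, 0]

enc = tokenizer(texts, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**enc)

# out.hidden_states is a tuple of (num_layers + 1) tensors of shape
# [batch, seq, hidden]; index 0 is the embedding layer, the rest are
# transformer layers. Probe each one independently.
last_idx = enc["attention_mask"].sum(dim=1) - 1  # each sequence's last real token
for layer, hidden in enumerate(out.hidden_states):
    X = hidden[torch.arange(len(texts)), last_idx].float().numpy()
    probe = LogisticRegression(max_iter=1000).fit(X, labels)
    # Training accuracy on 4 toy examples is only illustrative; a real probe
    # would evaluate on held-out data per task and per layer.
    print(f"layer {layer:2d}: probe accuracy = {probe.score(X, labels):.2f}")
```

Running the same loop over checkpoints of different sizes (e.g., 7B vs. 13B) gives the horizontal comparison across scales, while the per-layer accuracy curve gives the vertical comparison across depth.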