Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities

May 27, 2025
作者: Junyan Zhang, Yubo Gao, Yibo Yan, Jungang Li, Zhaorui Hou, Sicheng Tao, Shuliang Liu, Song Dai, Yonghua Hei, Junzhuo Li, Xuming Hu
cs.AI

Abstract

The fine-tuning of Large Language Models (LLMs) has significantly advanced their instruction-following capabilities, yet the underlying computational mechanisms driving these improvements remain poorly understood. This study systematically examines how fine-tuning reconfigures LLM computations by isolating and analyzing instruction-specific sparse components, i.e., neurons in dense models and both neurons and experts in Mixture-of-Experts (MoE) architectures. In particular, we introduce HexaInst, a carefully curated and balanced instructional dataset spanning six distinct categories, and propose SPARCOM, a novel analytical framework comprising three key contributions: (1) a method for identifying these sparse components, (2) an evaluation of their functional generality and uniqueness, and (3) a systematic comparison of their alterations. Through experiments, we demonstrate the functional generality and uniqueness of these components and their critical role in instruction execution. By elucidating the relationship between fine-tuning-induced adaptations and sparse computational substrates, this work offers the trustworthy-LLM community deeper insight into how LLMs internalize instruction-following behavior.
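
The abstract does not spell out SPARCOM's identification criterion. As one plausible illustration of how instruction-specific neurons might be located, the sketch below contrasts per-neuron firing rates on instruction prompts against a general-text baseline and keeps the most instruction-skewed neurons. The activation matrices, the positive-activation threshold, and the top_k parameter are all assumptions made for demonstration, not the paper's actual method.

```python
# Illustrative sketch (not SPARCOM itself): select neurons whose firing
# rate on instruction prompts far exceeds their rate on general text.
import numpy as np

def find_instruction_specific_neurons(
    instr_acts: np.ndarray,   # (num_instr_prompts, num_neurons) activations
    general_acts: np.ndarray, # (num_general_prompts, num_neurons) baseline
    top_k: int = 50,          # assumed budget of neurons to keep
) -> np.ndarray:
    """Return indices of the top_k most instruction-skewed neurons."""
    # Fraction of prompts on which each neuron is active (activation > 0
    # is an assumed activity threshold).
    instr_rate = (instr_acts > 0).mean(axis=0)
    general_rate = (general_acts > 0).mean(axis=0)
    # Rank neurons by how much more often they fire on instructions.
    skew = instr_rate - general_rate
    return np.argsort(skew)[-top_k:][::-1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic activations: 200 prompts x 1024 neurons, with the first
    # 50 neurons artificially biased toward the instruction condition.
    general = rng.normal(-0.5, 1.0, size=(200, 1024))
    instr = rng.normal(-0.5, 1.0, size=(200, 1024))
    instr[:, :50] += 1.5
    picked = find_instruction_specific_neurons(instr, general)
    print(f"{(picked < 50).mean():.0%} of selected neurons are true positives")
```

On synthetic data like the above, a simple firing-rate contrast recovers the planted neurons; the paper's framework additionally evaluates the generality and uniqueness of such components and compares how they change under fine-tuning.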
